Digital Health (May 2024)

Safety and quality of AI chatbots for drug-related inquiries: A real-world comparison with licensed pharmacists

  • Yasser Albogami,
  • Almaha Alfakhri,
  • Abdulaziz Alaqil,
  • Aljawharah Alkoraishi,
  • Heba Alshammari,
  • Yasmin Elsharawy,
  • Abdullah Alhammad,
  • Abdulaziz Alhossan

DOI
https://doi.org/10.1177/20552076241253523
Journal volume & issue
Vol. 10

Abstract

Introduction: Pharmacists play a pivotal role in ensuring that patients receive safe and effective medications; however, they face obstacles such as heavy workloads and a scarcity of qualified professionals. Despite the prospective utility of large language models (LLMs), such as Generative Pre-trained Transformers (GPTs), in addressing pharmaceutical inquiries, their applicability in real-world cases remains unexplored.

Objective: To evaluate the accuracy of GPT-based chatbots in answering real-world drug-related inquiries, comparing their performance to that of licensed pharmacists.

Methods: In this cross-sectional study, the authors analyzed real-world drug inquiries from a Drug Information Inquiry Database. Two independent pharmacists evaluated the performance of GPT-based chatbots (GPT-3, GPT-3.5, GPT-4) against human pharmacists using accuracy, detail, and risk of harm criteria. Descriptive statistics were used to summarize inquiry characteristics. Accuracy, detail, and risk of harm were compared using absolute proportions. Stratified analyses were performed for different inquiry types.

Results: Seventy inquiries were included. Most inquiries were received from physicians (41%) and pharmacists (44%). Inquiry types included dosage/administration (34.2%), drug interaction (12.8%), and pregnancy/lactation (15.7%). The majority of inquiries involved adults (83%) and female patients (54.3%). GPT-4 had 64.3% completely accurate responses, comparable to human pharmacists. Both GPT-4 and human pharmacists provided sufficiently detailed responses, with GPT-4 offering additional relevant details. Both GPT-4 and human pharmacists delivered 95% safe responses; however, GPT-4 provided proactive risk mitigation information in 70% of instances, whereas similar information was included in 25.7% of human pharmacists’ responses.

Conclusion: Our study showcased GPT-4’s potential to address drug-related inquiries accurately and safely, at a level comparable to human pharmacists. Current GPT-4-based chatbots could support healthcare professionals and foster global health improvements.
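
The absolute proportion comparison named in the Methods can be illustrated with a minimal Python sketch. It uses the proactive risk mitigation result as the example. The counts (49 and 18) are not reported in the abstract; they are back-calculated here from the stated percentages (70% and 25.7%) and the 70 included inquiries, under the assumption that both groups were rated on all 70 inquiries. The function name `absolute_difference` is hypothetical and is not taken from the study.

```python
def absolute_difference(count_a, count_b, n):
    """Return (p_a, p_b, p_a - p_b) for two groups rated on the same n items."""
    p_a = count_a / n
    p_b = count_b / n
    return p_a, p_b, p_a - p_b


N_INQUIRIES = 70            # reported in the abstract
GPT4_MITIGATION = 49        # assumption: back-calculated from 70% of 70
PHARMACIST_MITIGATION = 18  # assumption: back-calculated from 25.7% of 70

p_gpt4, p_pharm, diff = absolute_difference(
    GPT4_MITIGATION, PHARMACIST_MITIGATION, N_INQUIRIES
)

# Report each proportion and the absolute difference as percentages.
print(f"GPT-4: {p_gpt4:.1%}")
print(f"Pharmacists: {p_pharm:.1%}")
print(f"Absolute difference: {diff:.1%}")
```

Under these assumed counts, the absolute difference is roughly 44 percentage points. The sketch covers only the descriptive comparison; it makes no claim about any significance testing the authors may have performed.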