Large language model triaging of simulated nephrology patient inbox messages

Justin H. Pham; Charat Thongprayoon; Jing Miao; Supawadee Suppadungsuk; Priscilla Koirala; Iasmina M. Craici; Wisit Cheungpasitporn

doi:10.3389/frai.2024.1452469

Frontiers in Artificial Intelligence (Sep 2024)

Affiliations

Justin H. Pham: Mayo Clinic College of Medicine and Science, Mayo Clinic, Rochester, MN, United States
Charat Thongprayoon: Department of Nephrology and Hypertension, Mayo Clinic, Rochester, MN, United States
Jing Miao: Department of Nephrology and Hypertension, Mayo Clinic, Rochester, MN, United States
Supawadee Suppadungsuk: Department of Nephrology and Hypertension, Mayo Clinic, Rochester, MN, United States
Supawadee Suppadungsuk: Faculty of Medicine Ramathibodi Hospital, Chakri Naruebodindra Medical Institute, Mahidol University, Samut Prakan, Thailand
Priscilla Koirala: Department of Internal Medicine, Mayo Clinic, Rochester, MN, United States
Iasmina M. Craici: Department of Nephrology and Hypertension, Mayo Clinic, Rochester, MN, United States
Wisit Cheungpasitporn: Department of Nephrology and Hypertension, Mayo Clinic, Rochester, MN, United States

Journal volume & issue: Vol. 7

Abstract

Background: Efficient triage of patient communications is crucial for timely medical attention and improved care. This study evaluates ChatGPT’s accuracy in categorizing nephrology patient inbox messages, assessing its potential in outpatient settings.

Methods: One hundred and fifty simulated patient inbox messages were created based on cases typically encountered in everyday practice at a nephrology outpatient clinic. These messages were triaged as non-urgent, urgent, and emergent by two nephrologists. The messages were then submitted to ChatGPT-4 for independent triage into the same categories. The inquiry process was performed twice with a two-week period in between. ChatGPT responses were graded as correct (agreement with physicians), overestimation (higher priority), or underestimation (lower priority).

Results: In the first trial, ChatGPT correctly triaged 140 (93%) messages, overestimated the priority of 4 messages (3%), and underestimated the priority of 6 messages (4%). In the second trial, it correctly triaged 140 (93%) messages, overestimated the priority of 9 (6%), and underestimated the priority of 1 (1%). The accuracy did not depend on the urgency level of the message (p = 0.19). The internal agreement of ChatGPT responses was 92% with an intra-rater Kappa score of 0.88.

Conclusion: ChatGPT-4 demonstrated high accuracy in triaging nephrology patient messages, highlighting the potential for AI-driven triage systems to enhance operational efficiency and improve patient care in outpatient clinics.
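The grading scheme and the agreement statistic described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes the three categories are ordered non-urgent < urgent < emergent and that the intra-rater Kappa is an unweighted Cohen's kappa, which the abstract does not specify. The function names and the four-message toy lists are hypothetical and are not study data.

```python
from collections import Counter

# Assumed ordering of the triage categories (lowest to highest priority).
LEVELS = {"non-urgent": 0, "urgent": 1, "emergent": 2}


def grade(model_label, physician_label):
    """Grade one model response against the physician reference label."""
    diff = LEVELS[model_label] - LEVELS[physician_label]
    if diff == 0:
        return "correct"
    return "overestimation" if diff > 0 else "underestimation"


def cohen_kappa(first, second):
    """Unweighted Cohen's kappa between two label sequences (assumed variant)."""
    n = len(first)
    # Observed agreement: fraction of messages given the same label twice.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Agreement expected by chance from the marginal label frequencies.
    c1, c2 = Counter(first), Counter(second)
    expected = sum(c1[k] * c2[k] for k in LEVELS) / (n * n)
    # Undefined when expected == 1 (both trials use a single identical label).
    return (observed - expected) / (1 - expected)


# Hypothetical toy labels, for illustration only.
physician = ["non-urgent", "urgent", "emergent", "urgent"]
trial_1 = ["non-urgent", "emergent", "emergent", "non-urgent"]
trial_2 = ["non-urgent", "urgent", "emergent", "non-urgent"]

grades_1 = [grade(m, p) for m, p in zip(trial_1, physician)]
kappa = cohen_kappa(trial_1, trial_2)
```

On the toy lists, the first trial is graded correct, overestimation, correct, underestimation, and the two trials agree on three of four messages. Applying the same functions to the 150 labels from each trial would yield figures comparable to those reported, provided the study used the same kappa variant.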

Published in Frontiers in Artificial Intelligence

ISSN: 2624-8212 (Online)
Publisher: Frontiers Media S.A.
Country of publisher: Switzerland
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://www.frontiersin.org/journals/artificial-intelligence#
