Accuracy Evaluation of GPT-Assisted Differential Diagnosis in Emergency Department

Fatemeh Shah-Mohammadi; Joseph Finkelstein

doi:10.3390/diagnostics14161779

Diagnostics (Aug 2024)

Affiliations

Fatemeh Shah-Mohammadi: Department of Biomedical Informatics, School of Medicine, University of Utah, Salt Lake City, UT 84112, USA
Joseph Finkelstein: Department of Biomedical Informatics, School of Medicine, University of Utah, Salt Lake City, UT 84112, USA

DOI: https://doi.org/10.3390/diagnostics14161779
Journal volume & issue: Vol. 14, no. 16, p. 1779

Abstract

In emergency department (ED) settings, rapid and precise diagnostic evaluations are critical to ensure better patient outcomes and efficient healthcare delivery. This study assesses the accuracy of differential diagnosis lists generated by the third-generation ChatGPT (ChatGPT-3.5) and the fourth-generation ChatGPT (ChatGPT-4) based on electronic health record notes recorded within the first 24 h of ED admission. These models process unstructured text to formulate a ranked list of potential diagnoses. The accuracy of these models was benchmarked against actual discharge diagnoses to evaluate their utility as diagnostic aids. Results indicated that both GPT-3.5 and GPT-4 predicted diagnoses at the body system level with reasonable accuracy, and GPT-4 slightly outperformed its predecessor. However, their performance at the more granular category level was inconsistent, often showing decreased precision. Notably, GPT-4 demonstrated improved accuracy in several critical categories, which underscores its advanced capabilities in managing complex clinical scenarios.

Published in Diagnostics

ISSN: 2075-4418 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Medicine: Medicine (General)
Website: http://www.mdpi.com/journal/diagnostics
