Diagnostics (Aug 2024)

Accuracy Evaluation of GPT-Assisted Differential Diagnosis in Emergency Department

  • Fatemeh Shah-Mohammadi,
  • Joseph Finkelstein

DOI
https://doi.org/10.3390/diagnostics14161779
Journal volume & issue
Vol. 14, no. 16
p. 1779

Abstract

In emergency department (ED) settings, rapid and precise diagnostic evaluation is critical to patient outcomes and efficient healthcare delivery. This study assesses the accuracy of differential diagnosis lists generated by ChatGPT-3.5 (GPT-3.5) and ChatGPT-4 (GPT-4) from electronic health record notes recorded within the first 24 h of ED admission. The models process unstructured text to produce a ranked list of potential diagnoses, and their output was benchmarked against actual discharge diagnoses to evaluate their utility as diagnostic aids. Both GPT-3.5 and GPT-4 predicted diagnoses with reasonable accuracy at the body system level, with GPT-4 slightly outperforming its predecessor. At the more granular category level, however, performance was inconsistent and precision often decreased. Notably, GPT-4 showed improved accuracy in several critical categories, which underscores its advanced capabilities in managing complex clinical scenarios.
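
The two-level comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' evaluation code: the `Diagnosis` fields, the top-`k` cutoff, and the exact-string matching rule are all assumptions made for illustration. A case counts as a hit at a given level if any of the first `k` ranked diagnoses shares that level's label with any discharge diagnosis.

```python
from typing import NamedTuple, Sequence, Tuple


class Diagnosis(NamedTuple):
    # Hypothetical two-level label; the field names are assumptions.
    body_system: str  # coarse level, e.g. "circulatory"
    category: str     # granular level, e.g. "heart failure"


Case = Tuple[Sequence[Diagnosis], Sequence[Diagnosis]]  # (ranked list, discharge diagnoses)


def hit_at_level(ranked: Sequence[Diagnosis], actual: Sequence[Diagnosis],
                 level: str, k: int) -> bool:
    """True if any of the top-k ranked diagnoses matches a discharge diagnosis at `level`."""
    targets = {getattr(d, level) for d in actual}
    return any(getattr(d, level) in targets for d in ranked[:k])


def accuracy(cases: Sequence[Case], level: str, k: int = 5) -> float:
    """Fraction of cases with at least one top-k match at the given level."""
    if not cases:
        raise ValueError("no cases to evaluate")
    hits = sum(hit_at_level(ranked, actual, level, k) for ranked, actual in cases)
    return hits / len(cases)


# Toy data, invented purely to show the calculation.
example_cases = [
    # Correct body system, wrong category.
    ([Diagnosis("circulatory", "heart failure"),
      Diagnosis("respiratory", "pneumonia")],
     [Diagnosis("circulatory", "myocardial infarction")]),
    # Correct at both levels.
    ([Diagnosis("respiratory", "pneumonia")],
     [Diagnosis("respiratory", "pneumonia")]),
]

system_accuracy = accuracy(example_cases, "body_system")
category_accuracy = accuracy(example_cases, "category")
```

On this toy data, the body-system score is higher than the category score, which mirrors the pattern the abstract reports: a prediction can land in the right body system while naming the wrong specific category.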

Keywords