Archives of Academic Emergency Medicine (Jul 2024)

Diagnostic Accuracy of ChatGPT for Patients’ Triage; a Systematic Review and Meta-Analysis

  • Navid Kaboudi,
  • Saeedeh Firouzbakht,
  • Mohammad Shahir Eftekhar,
  • Fatemeh Fayazbakhsh,
  • Niloufar Joharivarnoosfaderani,
  • Salar Ghaderi,
  • Mohammadreza Dehdashti,
  • Yasmin Mohtasham Kia,
  • Maryam Afshari,
  • Maryam Vasaghi-Gharamaleki,
  • Leila Haghani,
  • Zahra Moradzadeh,
  • Fattaneh Khalaj,
  • Zahra Mohammadi,
  • Zahra Hasanabadi,
  • Ramin Shahidi

DOI
https://doi.org/10.22037/aaem.v12i1.2384
Journal volume & issue
Vol. 12, no. 1

Abstract

Introduction: Artificial intelligence (AI), particularly ChatGPT developed by OpenAI, has shown the potential to improve diagnostic accuracy and efficiency in emergency department (ED) triage. This study aims to evaluate the diagnostic performance and safety of ChatGPT in prioritizing patients based on urgency in ED settings.

Methods: A systematic review and meta-analysis were conducted following PRISMA guidelines. Comprehensive literature searches were performed in Scopus, Web of Science, PubMed, and Embase. Studies evaluating ChatGPT's diagnostic performance in ED triage were included. Quality assessment was conducted using the QUADAS-2 tool. Pooled accuracy estimates were calculated using a random-effects model, and heterogeneity was assessed with the I² statistic.

Results: Fourteen studies with a total of 1,412 patients or scenarios were included. ChatGPT 4.0 demonstrated a pooled accuracy of 0.86 (95% CI: 0.64-0.98) with substantial heterogeneity (I² = 93%). ChatGPT 3.5 showed a pooled accuracy of 0.63 (95% CI: 0.43-0.81) with significant heterogeneity (I² = 84%). Funnel plots indicated potential publication bias, particularly for ChatGPT 3.5. Quality assessments revealed varying levels of risk of bias and applicability concerns.

Conclusion: ChatGPT, especially version 4.0, shows promise in improving ED triage accuracy. However, significant variability and potential biases highlight the need for further evaluation and enhancement.
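For readers unfamiliar with the pooling step named in the Methods, the following is a minimal sketch of how a random-effects pooled accuracy and the I² statistic can be computed. It assumes a DerSimonian-Laird estimator applied to logit-transformed proportions with a 0.5 continuity correction; the abstract does not state which estimator or transformation the authors used, so this may differ from their procedure. The function name `pool_random_effects` and the study counts are hypothetical and are not data from the review.

```python
import math


def pool_random_effects(correct, totals):
    """Pool per-study accuracies with a DerSimonian-Laird random-effects model.

    Assumption: proportions are pooled on the logit scale, with 0.5 added to
    both cells of every study so the logit stays finite at 0% or 100%.
    """
    effects, variances = [], []
    for k, n in zip(correct, totals):
        a, b = k + 0.5, n - k + 0.5
        effects.append(math.log(a / b))      # logit of the study accuracy
        variances.append(1 / a + 1 / b)      # approximate variance of the logit

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1

    # Between-study variance (tau^2) and I^2 as a percentage.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1 / (v + tau2) for v in variances]
    mu = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))

    def expit(x):
        return 1 / (1 + math.exp(-x))

    return {
        "pooled": expit(mu),
        "ci95": (expit(mu - 1.96 * se), expit(mu + 1.96 * se)),
        "i2": i2,
        "tau2": tau2,
    }


# Hypothetical counts of correct triage decisions in three studies.
result = pool_random_effects(correct=[95, 40, 80], totals=[100, 100, 100])
print(result)
```

When the studies disagree strongly, as in the hypothetical counts above, I² approaches 100% and the confidence interval widens, which is the pattern the reported results describe.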

Keywords