Pediatric Discovery (Dec 2023)

Evaluation of ChatGPT's performance in providing treatment recommendations for pediatric diseases

  • Qiuhong Wei,
  • Yanqin Wang,
  • Zhengxiong Yao,
  • Ying Cui,
  • Bo Wei,
  • Tingyu Li,
  • Ximing Xu

DOI
https://doi.org/10.1002/pdi3.42
Journal volume & issue
Vol. 1, no. 3

Abstract

With advances in artificial intelligence technology, large language models such as ChatGPT are drawing substantial interest in the healthcare field. A growing body of research has evaluated ChatGPT's performance in various medical departments, yet its potential in pediatrics remains under‐studied. In this study, we presented ChatGPT with a total of 4160 clinical consultation questions in both English and Chinese, covering 104 pediatric conditions, and repeated each question independently 10 times to assess the accuracy of its treatment recommendations for pediatric diseases. ChatGPT achieved an overall accuracy of 82.2% (95% CI: 81.0%–83.4%), with higher accuracy when addressing common diseases (84.4%, 95% CI: 83.2%–85.7%), offering general treatment advice (83.5%, 95% CI: 81.9%–85.1%), and responding in English (93.0%, 95% CI: 91.9%–94.1%). However, it was prone to errors in disease definitions, medications, and surgical treatment. In conclusion, although ChatGPT shows promise in pediatric treatment recommendations with notable accuracy, cautious optimism is warranted regarding the potential of large language models to enhance patient care.
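
To show how a reported interval relates to the sample size, the following minimal sketch computes a normal-approximation (Wald) 95% confidence interval for the overall accuracy. It rests on two assumptions not confirmed by the abstract: that the denominator is the 4160 responses, and that a Wald interval was the method used. The function name `wald_interval` is hypothetical and chosen for illustration only.

```python
import math


def wald_interval(p_hat, n, z=1.96):
    """Normal-approximation (Wald) confidence interval for a proportion.

    p_hat: observed proportion (here, the reported overall accuracy)
    n:     number of observations (assumed here to be 4160 responses)
    z:     standard normal quantile; 1.96 gives a 95% interval
    """
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


# Reported overall accuracy; the denominator is an assumption (see lead-in).
low, high = wald_interval(0.822, 4160)
print(f"{low:.1%} to {high:.1%}")
```

Under these assumptions the result rounds to 81.0% and 83.4%, which agrees with the interval reported for the overall accuracy. The subgroup intervals cannot be checked the same way because the abstract does not state their denominators.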

Keywords