Applied Sciences (Sep 2024)

Assessing the Capability of Advanced AI Models in Cardiovascular Symptom Recognition: A Comparative Study

  • Jordi Cusidó,
  • Lluc Solé-Vilaró,
  • Pere Marti-Puig,
  • Jordi Solé-Casals

DOI
https://doi.org/10.3390/app14188440
Journal volume & issue
Vol. 14, no. 18
p. 8440

Abstract

The field of medical informatics has been significantly transformed in recent years by the emergence of Natural Language Understanding (NLU) and Large Language Models (LLMs), which provide new opportunities for innovative patient care solutions. This study aims to evaluate the effectiveness of publicly available LLMs as symptom checkers for cardiological diseases by comparing their diagnostic capabilities on real disease cases. We employed a set of nine models, including ChatGPT-4, open-source models, Google PaLM 2, and Meta’s LLaMA, to assess their diagnostic accuracy, reliability, and safety across various clinical scenarios. Our methodology involved presenting these LLMs with symptom descriptions and test results in Spanish, requiring them to provide specialist diagnoses and recommendations in English. This approach allowed us to compare the performance of each model, highlighting their respective strengths and limitations in a healthcare context. The results revealed varying levels of accuracy, precision, and sensitivity among the models, demonstrating the potential of LLMs to enhance medical education and patient care. By analysing the capabilities of each model, our study contributes to a deeper understanding of artificial intelligence’s role in medical diagnosis. We argue for the strategic implementation of LLMs in healthcare, emphasizing the importance of balancing sensitivity and realism to optimize patient outcomes.

Keywords