Journal of Primary Care Specialties (Apr 2024)

Addressing High-value Care with Generative Pretrained Transformer 4

  • Jassimran Singh,
  • Aditi Agrawal,
  • Navya Reddy Perkit

DOI
https://doi.org/10.4103/jopcs.jopcs_6_24
Journal volume & issue
Vol. 5, no. 2
pp. 108 – 111

Abstract

Background: High-value care emphasizes services that offer significant health benefits and align with patient preferences, while minimizing costs and interventions of little benefit. This approach is increasingly vital in a healthcare environment constrained by finite resources and rising costs. Large language models (LLMs) such as Generative Pretrained Transformer 4 (GPT-4), with their vast data processing capabilities, offer a promising avenue for supporting healthcare providers in making evidence-based, high-value care decisions.

Aims: This study aims to evaluate the performance of OpenAI’s GPT-4 in responding to high-value care clinical scenarios within internal medicine, assessing its accuracy, relevance, and reasoning against established medical guidelines and literature.

Materials and Methods: An observational study was conducted using MKSAP-19’s high-value care questions, comparing GPT-4’s responses with the correct answers, which are based on established studies, trials, and guidelines. The study did not involve real patient data, so Institutional Review Board (IRB) approval was not required. Performance metrics focused on the accuracy, relevance, and consistency of GPT-4’s answers.

Results: GPT-4 answered 32 of 43 questions correctly (74.4% accuracy) across a range of high-value care clinical scenarios, including image-based questions. Errors made by GPT-4 were similar to those made by medical residents using MKSAP-19, suggesting areas for model improvement and potential educational applications. The study detailed GPT-4’s decision-making pattern, emphasizing its clinical reasoning capabilities.

Conclusion: The findings suggest that GPT-4 can significantly support high-value care in internal medicine by providing accurate, evidence-based responses to complex clinical scenarios. Despite its limitations, including a 25.6% error rate (11 of 43 questions) and the scope of its training data, GPT-4’s performance indicates its potential as both a clinical and educational tool in healthcare.
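The accuracy and error rates reported above follow directly from the question counts, and the comparison described in the methods amounts to checking each model answer against an answer key. The Python sketch below illustrates that arithmetic under stated assumptions: the function name `score_responses` and the toy answer lists are hypothetical and are not taken from the study, whose actual grading procedure is not described in the abstract. Only the counts 32 and 43 come from the abstract.

```python
def score_responses(responses, answer_key):
    """Compare model answers with an answer key.

    Hypothetical helper for illustration only; the study's actual
    grading procedure is not specified in the abstract.
    Returns (n_correct, n_total, accuracy as a fraction).
    """
    if len(responses) != len(answer_key):
        raise ValueError("responses and answer_key must have the same length")
    if not answer_key:
        raise ValueError("answer_key must not be empty")
    # Case-insensitive match on the chosen option letter.
    n_correct = sum(
        r.strip().upper() == k.strip().upper()
        for r, k in zip(responses, answer_key)
    )
    n_total = len(answer_key)
    return n_correct, n_total, n_correct / n_total


# Toy data (invented for illustration, not from MKSAP-19).
toy_correct, toy_total, toy_accuracy = score_responses(
    ["A", "b", "C"], ["A", "B", "D"]
)

# Counts reported in the abstract: 32 correct out of 43 questions.
reported_correct, reported_total = 32, 43
accuracy_pct = round(100 * reported_correct / reported_total, 1)
error_pct = round(100 * (reported_total - reported_correct) / reported_total, 1)

print(f"Accuracy: {accuracy_pct}%  Error rate: {error_pct}%")
```

With the reported counts, the computed percentages match the 74.4% accuracy and 25.6% error rate stated in the abstract.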

Keywords