Journal of Patient Experience (Jul 2024)

Patient-Readable Radiology Report Summaries Generated via Large Language Model: Safety and Quality

  • Nicholas W. Sterling MD, PhD,
  • Felix Brann BSc,
  • Stephanie O. Frisch PhD, RN,
  • Justin D. Schrager MD, MPH

DOI
https://doi.org/10.1177/23743735241259477
Journal volume & issue
Vol. 11

Abstract

Complex medical terminology used in clinical documentation can be a barrier to patients' understanding of their medical findings. We aimed to generate easy-to-understand summaries of clinical radiology reports using large language models (LLMs) and to evaluate their safety and quality. Eight board-certified physician reviewers evaluated 1982 LLM-generated radiology report summaries (computed tomography, magnetic resonance imaging, ultrasound, and x-ray) for safety and quality, using predefined rating criteria and the corresponding original radiology reports for reference. Physician reviewers determined 99.2% (1967 out of 1982) of the LLM-generated summaries to be safe. The reviewers scored the quality of the LLM-generated summaries, from 5 (“Very Good”) to 1 (“Very Poor”), respectively, as follows: 80.6%, 11.1%, 5.7%, 1.7%, and 0.9%. Safety varied significantly across imaging modalities (P = .002). Large language models can be used to generate safe and high-quality summaries of clinical radiology reports. Further investigation is warranted to determine the impact of LLM-generated summaries on patients' perceived understanding, knowledge of their medical conditions, and overall experience.