Digital Health (Dec 2023)

Feasibility and acceptability of ChatGPT generated radiology report summaries for cancer patients

  • Eric M Chung,
  • Samuel C Zhang,
  • Anthony T Nguyen,
  • Katelyn M Atkins,
  • Howard M Sandler,
  • Mitchell Kamrava

DOI: https://doi.org/10.1177/20552076231221620
Journal volume & issue: Vol. 9

Abstract


Objective: Patients now have direct access to their radiology reports, which can include complex terminology and be difficult to understand. We assessed ChatGPT's ability to generate summarized MRI reports for patients with prostate cancer and evaluated physician satisfaction with the artificial intelligence (AI)-summarized reports.

Methods: We used ChatGPT to summarize five full MRI reports for patients with prostate cancer performed at a single institution from 2021 to 2022. Three summarized reports were generated for each full MRI report. Full MRI and summarized reports were assessed for readability using the Flesch-Kincaid Grade Level (FK) score. Radiation oncologists were asked to evaluate the AI-summarized reports via an anonymous questionnaire, with responses recorded on a 1–5 Likert-type scale. An additional 50 MRIs from newly diagnosed prostate cancer patients at a single institution were reviewed to assess physician online portal response rates.

Results: Fifteen summarized reports were generated from the five full MRI reports using ChatGPT. The median FK score was 9.6 for the full MRI reports vs. 5.0 for the summarized reports (p < 0.05). Twelve radiation oncologists responded to the questionnaire. The mean [SD] ratings for the summarized reports were: factual correctness 4.0 [0.6], ease of understanding 4.0 [0.7], completeness 4.1 [0.5], potential for harm 3.5 [0.9], overall quality 3.4 [0.9], and likelihood of sending to the patient 3.1 [1.1]. The current physician online portal response rate at our institution was 14/50 (28%).

Conclusions: We demonstrate a novel application of ChatGPT to summarize MRI reports at a reading level appropriate for patients. Physicians were generally satisfied with the summarized reports with respect to factual correctness, ease of understanding, and completeness, but less satisfied with respect to potential for harm, overall quality, and likelihood of sending them to patients. Further research is needed to optimize ChatGPT's ability to summarize radiology reports and to understand what factors influence physician trust in AI-summarized reports.
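
For readers who want a concrete picture of the workflow the abstract describes, the following Python sketch (not part of the published study) pairs an LLM summarization call with a Flesch-Kincaid Grade Level check. The model name, prompt wording, input file, and use of the openai and textstat packages are all assumptions; the authors used the ChatGPT interface and do not state how FK scores were computed.

    # Minimal sketch, assuming the openai (>=1.0) and textstat packages are installed.
    import textstat
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def summarize_report(full_report: str) -> str:
        """Ask a ChatGPT model to restate an MRI report in plain language."""
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",  # assumed model; the study used the ChatGPT interface
            messages=[
                {"role": "system",
                 "content": "Summarize this prostate MRI report for a patient "
                            "at roughly a fifth-grade reading level."},
                {"role": "user", "content": full_report},
            ],
        )
        return response.choices[0].message.content

    def fk_grade(text: str) -> float:
        # Flesch-Kincaid Grade Level:
        # 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
        return textstat.flesch_kincaid_grade(text)

    full_report = open("mri_report.txt").read()  # hypothetical de-identified report
    summary = summarize_report(full_report)
    print("Full report FK grade:", fk_grade(full_report))
    print("Summary FK grade:    ", fk_grade(summary))

The before-and-after readability check in this sketch corresponds to the abstract's comparison of median FK scores (9.6 vs. 5.0), though the values produced for any particular report will differ.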