Evaluation of information provided to patients by ChatGPT about chronic diseases in Spanish language

María Juliana Soto-Chávez; Marlon Mauricio Bustos; Daniel G. Fernández-Ávila; Oscar Mauricio Muñoz

doi:10.1177/20552076231224603

Digital Health (Jan 2024)

Affiliations

María Juliana Soto-Chávez: Department of Internal Medicine, Bogotá, Colombia
Marlon Mauricio Bustos: Department of Internal Medicine, Bogotá, Colombia
Daniel G. Fernández-Ávila: Rheumatology Unit, Bogotá, Colombia
Oscar Mauricio Muñoz: Department of Internal Medicine, Bogotá, Colombia

Journal volume & issue: Vol. 10

Abstract

Introduction: Artificial intelligence has grown exponentially in medicine. The ChatGPT language model has been highlighted as a possible source of patient information. This study evaluates the reliability and readability of ChatGPT-generated patient information on chronic diseases in Spanish.

Methods: Questions frequently asked by patients on the internet about diabetes mellitus, heart failure, rheumatoid arthritis (RA), chronic kidney disease (CKD), and systemic lupus erythematosus (SLE) were submitted to ChatGPT. Reliability was assessed by rating each response as (1) comprehensive, (2) correct but inadequate, (3) some correct and some incorrect, or (4) completely incorrect; ratings were then grouped as “good” (1 and 2) or “bad” (3 and 4). Readability was evaluated with the adapted Flesch and Szigriszt formulas.

Results: Overall, 71.67% of the answers were “good,” and none was rated “completely incorrect.” Reliability was better for questions on diabetes and RA than for those on heart failure (p = 0.02). Readability was “moderately difficult” overall (median 54.73, interquartile range (IQR) 51.59–58.58), with better scores for CKD (median 56.1, IQR 53.5–59.1) and RA (median 56.4, IQR 53.7–60.7) than for heart failure (median 50.6, IQR 46.3–53.8).

Conclusion: Our study suggests that ChatGPT can be a reliable source of information in Spanish for patients with chronic diseases, although reliability varies by disease. However, the readability of its answers needs to improve before it can be recommended as a useful tool for patients.
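The two measures described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' code. The function names are hypothetical, and the readability constants are those of the commonly cited Szigriszt-Pazos adaptation of the Flesch formula for Spanish (206.835 − 62.3 × syllables/words − words/sentences); the abstract does not state the exact formula, so treat that as an assumption. Syllable, word, and sentence counts are taken as inputs rather than computed, because Spanish syllabification is beyond the scope of a short example.

```python
# Illustrative sketch only: names and constants are assumptions, not the study's code.

GOOD_RATINGS = {1, 2}  # 1 = comprehensive, 2 = correct but inadequate
BAD_RATINGS = {3, 4}   # 3 = some correct and some incorrect, 4 = completely incorrect


def dichotomize(rating: int) -> str:
    """Map a 1-4 reliability rating onto the "good"/"bad" grouping."""
    if rating in GOOD_RATINGS:
        return "good"
    if rating in BAD_RATINGS:
        return "bad"
    raise ValueError(f"rating must be between 1 and 4, got {rating}")


def percent_good(ratings: list[int]) -> float:
    """Percentage of ratings that fall in the "good" group."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    labels = [dichotomize(r) for r in ratings]
    return 100.0 * labels.count("good") / len(labels)


def szigriszt_index(syllables: int, words: int, sentences: int) -> float:
    """Readability score under the assumed Szigriszt-Pazos formula.

    Higher scores indicate text that is easier to read.
    """
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 206.835 - 62.3 * (syllables / words) - (words / sentences)


if __name__ == "__main__":
    # Hypothetical ratings, not data from the study.
    example_ratings = [1, 2, 3, 1]
    print(percent_good(example_ratings))
    # Hypothetical counts for one response.
    print(szigriszt_index(syllables=200, words=100, sentences=5))
```

Keeping the counting step outside `szigriszt_index` means any syllable counter can be swapped in without changing the scoring logic.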

Published in Digital Health

ISSN: 2055-2076 (Online)
Publisher: SAGE Publishing
Country of publisher: United Kingdom
LCC subjects: Medicine: Medicine (General): Computer applications to medicine. Medical informatics
Website: https://journals.sagepub.com/home/dhj
