Clinical Ophthalmology (Dec 2024)

Impact of Demographic Modifiers on Readability of Myopia Education Materials Generated by Large Language Models

  • Lee GG,
  • Goodman D,
  • Chang TCP

Journal volume & issue
Vol. 18
pp. 3591 – 3604

Abstract

Gabriela G Lee, Deniz Goodman, Ta Chen Peter Chang

Department of Ophthalmology, Bascom Palmer Eye Institute, University of Miami Miller School of Medicine, Miami, FL, USA

Correspondence: Ta Chen Peter Chang, Department of Ophthalmology, Bascom Palmer Eye Institute, University of Miami Miller School of Medicine, 900 NW 17th Street #450N, Miami, FL, 33136, USA, Tel +1 (305) 326-6400, Email [email protected]

The rise of large language models (LLMs) promises to have a broad impact on healthcare providers and patients alike. Because these tools reflect the biases of data currently available on the internet, there is a risk that increasing LLM use will propagate these biases and affect information quality. This study aims to characterize how different race, ethnicity, and gender modifiers in question prompts presented to three LLMs affect the length and readability of patient education materials about myopia.

Methods: ChatGPT, Gemini, and Copilot were given a standardized prompt incorporating demographic modifiers to inquire about myopia. The races and ethnicities evaluated were Asian, Black, Hispanic, Native American, and White. Gender was limited to male or female. Each prompt was entered five times, each time in a new chat window. Responses were analyzed for readability by word count, Simple Measure of Gobbledygook (SMOG) index, Flesch-Kincaid Grade Level, and Flesch Reading Ease score. Differences were tested for significance using two-way ANOVA in SPSS.

Results: A total of 150 responses were analyzed. For ChatGPT and Copilot, there were no significant differences in SMOG index, Flesch-Kincaid Grade Level, or Flesch Reading Ease score between responses generated with prompts containing different gender, race, or ethnicity modifiers. Gemini-generated responses differed significantly in SMOG index, Flesch-Kincaid Grade Level, and Flesch Reading Ease score based on the race mentioned in the prompt (p < 0.05).

Conclusion: Patient demographic information affects the reading level of educational material generated by Gemini but not by ChatGPT or Copilot. As patients use LLMs to understand ophthalmologic diagnoses such as myopia, clinicians and users should be aware of demographic influences on readability. Patient gender, race, and ethnicity may be overlooked variables affecting the readability of LLM-generated education materials, which can in turn affect patient care. Future research could focus on the accuracy of generated information to identify potential risks of misinformation.

Keywords: health literacy, readability, large language models
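
The three readability measures named in the Methods can be illustrated with a minimal Python sketch. It uses the standard published formulas for Flesch Reading Ease, Flesch-Kincaid Grade Level, and SMOG index. The abstract does not say which calculator the authors used, so the function names, the sentence splitter, and the vowel-group syllable heuristic here are all assumptions for illustration. Scores from dedicated tools may differ somewhat, mainly because syllables are counted differently.

```python
import math
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate (an assumption, not the study's method):
    count vowel groups and drop a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def readability(text: str) -> dict:
    """Return word count and three readability scores for a passage."""
    # Naive sentence split on terminal punctuation; abbreviations are not handled.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")

    syllables = [count_syllables(w) for w in words]
    n_sent, n_words = len(sentences), len(words)
    words_per_sentence = n_words / n_sent
    syllables_per_word = sum(syllables) / n_words
    # SMOG counts words with three or more syllables.
    polysyllables = sum(1 for s in syllables if s >= 3)

    return {
        "word_count": n_words,
        "flesch_reading_ease": 206.835
        - 1.015 * words_per_sentence
        - 84.6 * syllables_per_word,
        "flesch_kincaid_grade": 0.39 * words_per_sentence
        + 11.8 * syllables_per_word
        - 15.59,
        "smog_index": 1.0430 * math.sqrt(polysyllables * 30 / n_sent) + 3.1291,
    }


if __name__ == "__main__":
    sample = (
        "Myopia is a common eye condition. "
        "Distant objects look blurry while near objects stay clear."
    )
    for name, value in readability(sample).items():
        print(name, round(value, 2))
```

Higher Flesch Reading Ease values indicate easier text, whereas higher Flesch-Kincaid and SMOG values correspond to a higher estimated school grade level, so the measures move in opposite directions for the same passage.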
