Laryngoscope Investigative Otolaryngology (Feb 2024)
Can ChatGPT help patients answer their otolaryngology questions?
Abstract
Background: Over the past year, the world has been captivated by the potential of artificial intelligence (AI). The appetite for AI in science, and in healthcare specifically, is huge. It is imperative to understand the credibility of large language models in assisting the public with medical queries.

Objective: To evaluate the ability of ChatGPT to provide reasonably accurate answers to public queries within the domain of otolaryngology.

Methods: Two board‐certified otolaryngologists (HZ, RS) entered 30 text‐based patient queries into the ChatGPT‐3.5 model. Physicians rated the ChatGPT responses on a 3‐point scale (accurate, partially accurate, incorrect), while layperson reviewers used a similar 3‐point scale based on confidence. Demographic data on gender and education level were recorded for the public reviewers. Inter‐rater agreement percentages were analyzed using the binomial distribution to calculate 95% confidence intervals and perform significance tests. Statistical significance was defined as p < .05 for two‐sided tests.

Results: In testing patient queries, both otolaryngologists found that ChatGPT answered 98.3% of questions correctly, but only 79.8% (range 51.7%–100%) of patients were confident that the AI model was accurate in its responses (corrected agreement = 0.682; p < .001). Among the layperson responses, the corrected agreement coefficient indicated moderate agreement (0.571; p < .001). No correlation was noted between the layperson responses and age, gender, or education level.

Conclusion: From a physician standpoint, ChatGPT is highly accurate in responding to questions posed by the public with regard to otolaryngology. Public reviewers were not fully confident in the AI model, citing subjective concerns such as lower trust in AI answers than in a physician's explanation. Larger evaluations with a representative public sample and broader medical questions should immediately be conducted by appropriate organizations, governing bodies, and/or governmental agencies to instill public confidence in AI and ChatGPT as a medical resource.

Level of Evidence: 4
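
The Methods mention binomial 95% confidence intervals for agreement percentages but do not name the interval used. As a minimal sketch, assuming a Wilson score interval (one common binomial interval, not necessarily the authors' choice) and assuming that the 98.3% figure corresponds to 59 of 60 physician ratings (2 raters × 30 queries; this count is inferred, not stated in the abstract), the calculation might look like this in Python. The function name `wilson_interval` is hypothetical.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.959963984540054):
    """Wilson score interval for a binomial proportion.

    z defaults to the standard normal quantile for a two-sided 95% interval.
    This is an illustrative choice; the abstract does not say which
    binomial interval the authors computed.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")

    p_hat = successes / trials
    denom = 1.0 + z * z / trials
    centre = (p_hat + z * z / (2.0 * trials)) / denom
    half_width = (
        z
        * math.sqrt(p_hat * (1.0 - p_hat) / trials + z * z / (4.0 * trials * trials))
        / denom
    )
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Assumed counts: 59 "accurate" ratings out of 60 (inferred from 98.3%).
low, high = wilson_interval(59, 60)
print(f"observed = {59 / 60:.3f}, 95% CI = ({low:.3f}, {high:.3f})")
```

With only 60 ratings, the interval around a proportion near 100% is noticeably asymmetric, which is one reason a score-based or exact interval is often preferred over the simple normal approximation in this setting.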
Keywords