AJOG Global Reports (Nov 2024)
Artificial intelligence generates proficient Spanish obstetrics and gynecology counseling templates
Abstract
Background: Effective patient counseling in obstetrics and gynecology is vital. Language barriers between Spanish-speaking patients and English-speaking providers may negatively impact patient understanding and adherence to medical recommendations, as language discordance between provider and patient has been associated with medication noncompliance, adverse drug events, and underuse of preventative care. Artificial intelligence large language models may be a helpful adjunct to patient care by generating counseling templates in Spanish.
Objectives: The primary objective was to determine whether large language models can generate proficient counseling templates in Spanish on obstetrics and gynecology topics. Secondary objectives were to (1) compare the content, quality, and comprehensiveness of generated templates between different large language models, (2) compare the proficiency ratings among the large language model generated templates, and (3) assess which generated templates had potential for integration into clinical practice.
Study design: This was a cross-sectional study using free, open-access large language models to generate counseling templates in Spanish on select obstetrics and gynecology topics. Native Spanish-speaking practicing obstetricians and gynecologists, who were blinded to the source large language model for each template, reviewed and subjectively scored each template on its content, quality, and comprehensiveness and considered it for integration into clinical practice. Proficiency ratings were calculated as a composite score of content, quality, and comprehensiveness. A score of >4 was considered proficient. Basic inferential statistics were performed.
Results: All artificial intelligence large language models generated proficient obstetrics and gynecology counseling templates in Spanish, with Google Bard generating the most proficient template (P<.0001) and outperforming the others in comprehensiveness (P=.03), quality (P=.04), and content (P=.01). Microsoft Bing received the lowest scores in these domains. Physicians reported being likely to incorporate the templates into clinical practice, with no significant difference in the likelihood of integration based on the source large language model (P=.45).
Conclusions: Large language models have the potential to generate proficient obstetrics and gynecology counseling templates in Spanish, which physicians would integrate into their clinical practice. Google Bard scored the highest across all attributes. Large language models offer an opportunity to help mitigate language barriers in health care. Future studies should assess patient satisfaction, understanding, and adherence to clinical plans following receipt of these counseling templates.