JBJS Open Access (Dec 2024)
Can Artificial Intelligence Fool Residency Selection Committees? Analysis of Personal Statements by Real Applicants and Generative AI, a Randomized, Single-Blind Multicenter Study
Abstract
Introduction: The potential capabilities of generative artificial intelligence (AI) tools remain relatively unexplored, particularly for drafting personal statements for medical students applying to residency. This study aimed to investigate the ability of generative AI, specifically ChatGPT and Google BARD, to generate personal statements and to assess whether faculty on residency selection committees could (1) distinguish real statements from AI-generated ones and (2) detect differences across 13 defined and specific metrics of a personal statement.

Methods: Fifteen real personal statements were used to generate 15 unique and distinct personal statements each from ChatGPT and BARD, for a total of 45 statements. The statements were then randomized, blinded, and presented to a group of faculty reviewers serving on residency selection committees. Reviewers assessed each statement on 14 metrics, including whether it was AI-generated or real. All metrics were compared across groups.

Results: Faculty correctly identified 88% (79/90) of real statements, 90% (81/90) of BARD statements, and 44% (40/90) of ChatGPT statements. Accuracy in identifying real and BARD statements was 89%, but overall accuracy dropped to 74% when ChatGPT statements were included. In addition, accuracy did not increase as faculty members reviewed more personal statements (area under the curve [AUC] 0.498, p = 0.966). BARD performed worse than both real and ChatGPT statements across all metrics (p < 0.001). Comparing real statements with ChatGPT, there was no difference in most metrics, except for Personal Interests, Reasons for Choosing Residency, Career Goals, Compelling Nature, and Originality, all favoring the real personal statements (p = 0.001, p = 0.002, p < 0.001, p < 0.001, and p < 0.001, respectively).

Conclusion: Faculty members accurately identified real and BARD statements, but ChatGPT deceived them 56% of the time. Although AI can craft convincing statements that are sometimes indistinguishable from real ones, it struggles to replicate the humanistic experience, personal nuances, and individualistic elements found in real personal statements. Given the growing capabilities of AI in this arena, residency selection committees may want to prioritize these particular metrics when assessing personal statements.

Clinical Relevance: When evaluating personal statements, residency selection committees may want to prioritize metrics unique to the human element, such as personal interests, reasons for choosing residency, career goals, compelling nature, and originality.