npj Digital Medicine (Mar 2025)
Red teaming ChatGPT in medicine to yield real-world insights on model behavior
- Crystal T. Chang,
- Hodan Farah,
- Haiwen Gui,
- Shawheen Justin Rezaei,
- Charbel Bou-Khalil,
- Ye-Jean Park,
- Akshay Swaminathan,
- Jesutofunmi A. Omiye,
- Akaash Kolluri,
- Akash Chaurasia,
- Alejandro Lozano,
- Alice Heiman,
- Allison Sihan Jia,
- Amit Kaushal,
- Angela Jia,
- Angelica Iacovelli,
- Archer Yang,
- Arghavan Salles,
- Arpita Singhal,
- Balasubramanian Narasimhan,
- Benjamin Belai,
- Benjamin H. Jacobson,
- Binglan Li,
- Celeste H. Poe,
- Chandan Sanghera,
- Chenming Zheng,
- Conor Messer,
- Damien Varid Kettud,
- Deven Pandya,
- Dhamanpreet Kaur,
- Diana Hla,
- Diba Dindoust,
- Dominik Moehrle,
- Duncan Ross,
- Ellaine Chou,
- Eric Lin,
- Fateme Nateghi Haredasht,
- Ge Cheng,
- Irena Gao,
- Jacob Chang,
- Jake Silberg,
- Jason A. Fries,
- Jiapeng Xu,
- Joe Jamison,
- John S. Tamaresis,
- Jonathan H. Chen,
- Joshua Lazaro,
- Juan M. Banda,
- Julie J. Lee,
- Karen Ebert Matthys,
- Kirsten R. Steffner,
- Lu Tian,
- Luca Pegolotti,
- Malathi Srinivasan,
- Maniragav Manimaran,
- Matthew Schwede,
- Minghe Zhang,
- Minh Nguyen,
- Mohsen Fathzadeh,
- Qian Zhao,
- Rika Bajra,
- Rohit Khurana,
- Ruhana Azam,
- Rush Bartlett,
- Sang T. Truong,
- Scott L. Fleming,
- Shriti Raj,
- Solveig Behr,
- Sonia Onyeka,
- Sri Muppidi,
- Tarek Bandali,
- Tiffany Y. Eulalio,
- Wenyuan Chen,
- Xuanyu Zhou,
- Yanan Ding,
- Ying Cui,
- Yuqi Tan,
- Yutong Liu,
- Nigam Shah,
- Roxana Daneshjou
Affiliations
- Crystal T. Chang
- Department of Dermatology, Stanford University
- Hodan Farah
- Department of Dermatology, Stanford University
- Haiwen Gui
- Department of Dermatology, Stanford University
- Shawheen Justin Rezaei
- School of Medicine, Stanford University
- Charbel Bou-Khalil
- School of Medicine, Stanford University
- Ye-Jean Park
- Temerty Faculty of Medicine
- Akshay Swaminathan
- School of Medicine, Stanford University
- Jesutofunmi A. Omiye
- Department of Dermatology, Stanford University
- Akaash Kolluri
- Stanford University
- Akash Chaurasia
- Department of Computer Science, Stanford University
- Alejandro Lozano
- Department of Biomedical Data Science, Stanford University
- Alice Heiman
- Stanford University
- Allison Sihan Jia
- Stanford University
- Amit Kaushal
- Department of Bioengineering, Stanford University
- Angela Jia
- Stanford University
- Angelica Iacovelli
- Department of Pediatrics, Stanford University
- Archer Yang
- Department of Biomedical Data Science, Stanford University
- Arghavan Salles
- Stanford University
- Arpita Singhal
- Department of Computer Science, Stanford University
- Balasubramanian Narasimhan
- Stanford University
- Benjamin Belai
- Department of Psychiatry, Stanford University
- Benjamin H. Jacobson
- School of Medicine, Stanford University
- Binglan Li
- Department of Biomedical Data Science, Stanford University
- Celeste H. Poe
- School of Medicine, Stanford University
- Chandan Sanghera
- Stanford University
- Chenming Zheng
- School of Medicine, Stanford University
- Conor Messer
- Stanford University
- Damien Varid Kettud
- Stanford University
- Deven Pandya
- Stanford University
- Dhamanpreet Kaur
- School of Medicine, Stanford University
- Diana Hla
- Mayo Clinic Alix School of Medicine
- Diba Dindoust
- Stanford University
- Dominik Moehrle
- School of Medicine, Stanford University
- Duncan Ross
- Department of Statistics, Stanford University
- Ellaine Chou
- Department of Biomedical Data Science, Stanford University
- Eric Lin
- Veterans Affairs Medical Center
- Fateme Nateghi Haredasht
- Center for Biomedical Informatics Research, Stanford University
- Ge Cheng
- Department of Biomedical Data Science, Stanford University
- Irena Gao
- Stanford University
- Jacob Chang
- Department of Biomedical Data Science, Stanford University
- Jake Silberg
- Department of Biomedical Data Science, Stanford University
- Jason A. Fries
- Center for Biomedical Informatics Research, Stanford University
- Jiapeng Xu
- Department of Biomedical Data Science, Stanford University
- Joe Jamison
- Department of Statistics, Stanford University
- John S. Tamaresis
- Department of Biomedical Data Science, Stanford University
- Jonathan H. Chen
- Clinical Excellence Research Center, School of Medicine, Stanford University
- Joshua Lazaro
- Department of Biomedical Data Science, Stanford University
- Juan M. Banda
- Technology and Digital Solutions, Stanford Health Care
- Julie J. Lee
- Department of Pediatrics, Stanford University
- Karen Ebert Matthys
- Department of Biomedical Data Science, Stanford University
- Kirsten R. Steffner
- Department of Anesthesiology, Stanford University
- Lu Tian
- Stanford University
- Luca Pegolotti
- Department of Pediatrics, Stanford University
- Malathi Srinivasan
- School of Medicine, Stanford University
- Maniragav Manimaran
- Graduate School of Business, Stanford University
- Matthew Schwede
- Department of Medicine, Stanford University
- Minghe Zhang
- Department of Statistics, Stanford University
- Minh Nguyen
- Stanford University
- Mohsen Fathzadeh
- Department of Epidemiology and Population Health, Stanford University
- Qian Zhao
- Department of Biomedical Data Science, Stanford University
- Rika Bajra
- School of Medicine, Stanford University
- Rohit Khurana
- Department of Biomedical Data Science, Stanford University
- Ruhana Azam
- Stanford University
- Rush Bartlett
- Stanford BioDesign, Stanford University
- Sang T. Truong
- Department of Computer Science, Stanford University
- Scott L. Fleming
- Department of Biomedical Data Science, Stanford University
- Shriti Raj
- Center for Biomedical Informatics Research, Stanford University
- Solveig Behr
- Department of Education and Psychology, Freie Universität Berlin
- Sonia Onyeka
- Department of Dermatology, Stanford University
- Sri Muppidi
- Stanford University
- Tarek Bandali
- Stanford University
- Tiffany Y. Eulalio
- Department of Biomedical Data Science, Stanford University
- Wenyuan Chen
- Department of Biomedical Data Science, Stanford University
- Xuanyu Zhou
- Department of Epidemiology and Population Health, Stanford University
- Yanan Ding
- Department of Biomedical Data Science, Stanford University
- Ying Cui
- Stanford University
- Yuqi Tan
- Department of Pathology, Stanford University
- Yutong Liu
- Department of Epidemiology and Population Health, Stanford University
- Nigam Shah
- School of Medicine, Stanford University
- Roxana Daneshjou
- Department of Dermatology, Stanford University
- DOI
- https://doi.org/10.1038/s41746-025-01542-0
- Journal volume & issue
- Vol. 8, no. 1, pp. 1 – 10
Abstract
Red teaming, the practice of adversarially exposing unexpected or undesired model behaviors, is critical to improving the equity and accuracy of large language models, but red teaming by parties unaffiliated with model creators is scant in healthcare. We convened teams of clinicians, medical and engineering students, and technical professionals (80 participants total) to stress-test models with real-world clinical cases and categorize inappropriate responses along axes of safety, privacy, hallucinations/accuracy, and bias. Six medically trained reviewers re-analyzed prompt-response pairs and added qualitative annotations. Of 376 unique prompts (1504 responses), 20.1% were inappropriate (GPT-3.5: 25.8%; GPT-4.0: 16%; GPT-4.0 with Internet: 17.8%). Subsequently, we show the utility of our benchmark by testing GPT-4o, a model released after our event (20.4% inappropriate). Of the responses that were appropriate with GPT-3.5, 21.5% were inappropriate in updated models. We share insights for constructing red teaming prompts, and present our benchmark for iterative model assessments.
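
The per-model percentages in the abstract are shares of responses that reviewers labeled inappropriate. As a minimal sketch of that tally, the Python below computes such a rate from labeled records. The function name `inappropriate_rate_by_model`, the record layout, and the toy labels are all hypothetical illustrations and do not come from the study's data or code. The only figures taken from the abstract are the counts 376 and 1504, which imply four responses per prompt.

```python
from collections import defaultdict


def inappropriate_rate_by_model(records):
    """Return {model: percent of responses labeled inappropriate}.

    `records` is an iterable of (model_name, is_inappropriate) pairs.
    This layout is an assumption made for illustration only.
    """
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for model, is_inappropriate in records:
        totals[model] += 1
        if is_inappropriate:
            flagged[model] += 1
    return {model: 100.0 * flagged[model] / totals[model] for model in totals}


# Toy labels (hypothetical, NOT the study's annotations).
toy_records = [
    ("model_a", True), ("model_a", False), ("model_a", False), ("model_a", False),
    ("model_b", False), ("model_b", True), ("model_b", True), ("model_b", False),
]
rates = inappropriate_rate_by_model(toy_records)

# Counts reported in the abstract: 1504 responses over 376 unique prompts.
responses_per_prompt = 1504 // 376
```

Applied to the real annotations, the same tally could be rerun whenever a new model version is released, which is the kind of iterative assessment the abstract describes.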