Human visual explanations mitigate bias in AI-based assessment of surgeon skills

Dani Kiyasseh; Jasper Laca; Taseen F. Haque; Maxwell Otiato; Brian J. Miles; Christian Wagner; Daniel A. Donoho; Quoc-Dien Trinh; Animashree Anandkumar; Andrew J. Hung

doi:10.1038/s41746-023-00766-2

npj Digital Medicine (Mar 2023)

Affiliations

Dani Kiyasseh: Department of Computing and Mathematical Sciences, California Institute of Technology
Jasper Laca: Center for Robotic Simulation and Education, Catherine & Joseph Aresty Department of Urology, University of Southern California
Taseen F. Haque: Center for Robotic Simulation and Education, Catherine & Joseph Aresty Department of Urology, University of Southern California
Maxwell Otiato: Center for Robotic Simulation and Education, Catherine & Joseph Aresty Department of Urology, University of Southern California
Brian J. Miles: Department of Urology, Houston Methodist Hospital
Christian Wagner: Department of Urology, Pediatric Urology and Uro-Oncology, Prostate Center Northwest, St. Antonius-Hospital
Daniel A. Donoho: Division of Neurosurgery, Center for Neuroscience, Children’s National Hospital
Quoc-Dien Trinh: Center for Surgery & Public Health, Department of Surgery, Brigham and Women’s Hospital, Harvard Medical School
Animashree Anandkumar: Department of Computing and Mathematical Sciences, California Institute of Technology
Andrew J. Hung: Center for Robotic Simulation and Education, Catherine & Joseph Aresty Department of Urology, University of Southern California

DOI: https://doi.org/10.1038/s41746-023-00766-2
Journal volume & issue: Vol. 6, no. 1, pp. 1–12

Abstract

Artificial intelligence (AI) systems can now reliably assess surgeon skills through videos of intraoperative surgical activity. With such systems informing future high-stakes decisions such as whether to credential surgeons and grant them the privilege to operate on patients, it is critical that they treat all surgeons fairly. However, it remains an open question whether surgical AI systems exhibit bias against surgeon sub-cohorts and, if so, whether such bias can be mitigated. Here, we examine and mitigate the bias exhibited by a family of surgical AI systems (SAIS) deployed on videos of robotic surgeries from three geographically diverse hospitals (USA and EU). We show that SAIS exhibits an underskilling bias, erroneously downgrading surgical performance, and an overskilling bias, erroneously upgrading surgical performance, at different rates across surgeon sub-cohorts. To mitigate such bias, we leverage a strategy (TWIX) which teaches an AI system to provide a visual explanation for its skill assessment that otherwise would have been provided by human experts. We show that whereas baseline strategies inconsistently mitigate algorithmic bias, TWIX can effectively mitigate the underskilling and overskilling bias while simultaneously improving the performance of these AI systems across hospitals. We discovered that these findings carry over to the training environment where we assess medical students’ skills today. Our study is a critical prerequisite to the eventual implementation of AI-augmented global surgeon credentialing programs, ensuring that all surgeons are treated fairly.

Published in npj Digital Medicine

ISSN: 2398-6352 (Online)
Publisher: Nature Portfolio
Country of publisher: United Kingdom
LCC subjects: Medicine: Medicine (General): Computer applications to medicine. Medical informatics
Website: https://www.nature.com/npjdigitalmed/
