Considerations in the reliability and fairness audits of predictive models for advance care planning

Jonathan Lu; Amelia Sattler; Samantha Wang; Ali Raza Khaki; Alison Callahan; Scott Fleming; Rebecca Fong; Benjamin Ehlert; Ron C. Li; Lisa Shieh; Kavitha Ramchandran; Michael F. Gensheimer; Sarah Chobot; Stephen Pfohl; Siyun Li; Kenny Shum; Nitin Parikh; Priya Desai; Briththa Seevaratnam; Melanie Hanson; Margaret Smith; Yizhe Xu; Arjun Gokhale; Steven Lin; Michael A. Pfeffer; Winifred Teuteberg; Nigam H. Shah

doi:10.3389/fdgth.2022.943768

Frontiers in Digital Health (Sep 2022)

Affiliations

Jonathan Lu: Center for Biomedical Informatics Research, Department of Medicine, Stanford University School of Medicine, Palo Alto, United States
Amelia Sattler: Stanford Healthcare AI Applied Research Team, Division of Primary Care and Population Health, Department of Medicine, Stanford University School of Medicine, Palo Alto, United States
Samantha Wang: Division of Hospital Medicine, Department of Medicine, Stanford University School of Medicine, Palo Alto, United States
Ali Raza Khaki: Division of Oncology, Department of Medicine, Stanford University School of Medicine, Palo Alto, United States
Alison Callahan: Center for Biomedical Informatics Research, Department of Medicine, Stanford University School of Medicine, Palo Alto, United States
Scott Fleming: Center for Biomedical Informatics Research, Department of Medicine, Stanford University School of Medicine, Palo Alto, United States
Rebecca Fong: Serious Illness Care Program, Department of Medicine, Stanford University School of Medicine, Palo Alto, United States
Benjamin Ehlert: Center for Biomedical Informatics Research, Department of Medicine, Stanford University School of Medicine, Palo Alto, United States
Ron C. Li: Division of Hospital Medicine, Department of Medicine, Stanford University School of Medicine, Palo Alto, United States
Lisa Shieh: Division of Hospital Medicine, Department of Medicine, Stanford University School of Medicine, Palo Alto, United States
Kavitha Ramchandran: Division of Oncology, Department of Medicine, Stanford University School of Medicine, Palo Alto, United States
Michael F. Gensheimer: Department of Radiation Oncology, Stanford University School of Medicine, Palo Alto, United States
Sarah Chobot: Inpatient Palliative Care, Stanford Health Care, Palo Alto, United States
Stephen Pfohl: Center for Biomedical Informatics Research, Department of Medicine, Stanford University School of Medicine, Palo Alto, United States
Siyun Li: Center for Biomedical Informatics Research, Department of Medicine, Stanford University School of Medicine, Palo Alto, United States
Kenny Shum: Technology / Digital Solutions, Stanford Health Care and Stanford University School of Medicine, Palo Alto, United States
Nitin Parikh: Technology / Digital Solutions, Stanford Health Care and Stanford University School of Medicine, Palo Alto, United States
Priya Desai: Technology / Digital Solutions, Stanford Health Care and Stanford University School of Medicine, Palo Alto, United States
Briththa Seevaratnam: Serious Illness Care Program, Department of Medicine, Stanford University School of Medicine, Palo Alto, United States
Melanie Hanson: Serious Illness Care Program, Department of Medicine, Stanford University School of Medicine, Palo Alto, United States
Margaret Smith: Stanford Healthcare AI Applied Research Team, Division of Primary Care and Population Health, Department of Medicine, Stanford University School of Medicine, Palo Alto, United States
Yizhe Xu: Center for Biomedical Informatics Research, Department of Medicine, Stanford University School of Medicine, Palo Alto, United States
Arjun Gokhale: Center for Biomedical Informatics Research, Department of Medicine, Stanford University School of Medicine, Palo Alto, United States
Steven Lin: Stanford Healthcare AI Applied Research Team, Division of Primary Care and Population Health, Department of Medicine, Stanford University School of Medicine, Palo Alto, United States
Michael A. Pfeffer: Division of Hospital Medicine, Department of Medicine, Stanford University School of Medicine, Palo Alto, United States; Technology / Digital Solutions, Stanford Health Care and Stanford University School of Medicine, Palo Alto, United States
Winifred Teuteberg: Serious Illness Care Program, Department of Medicine, Stanford University School of Medicine, Palo Alto, United States
Nigam H. Shah: Center for Biomedical Informatics Research, Department of Medicine, Stanford University School of Medicine, Palo Alto, United States; Technology / Digital Solutions, Stanford Health Care and Stanford University School of Medicine, Palo Alto, United States; Clinical Excellence Research Center, Stanford University School of Medicine, Palo Alto, United States

DOI: https://doi.org/10.3389/fdgth.2022.943768
Journal volume & issue: Vol. 4

Abstract

Multiple reporting guidelines for artificial intelligence (AI) models in healthcare recommend that models be audited for reliability and fairness. However, operational guidance for performing such audits in practice is lacking. Following guideline recommendations, we conducted a reliability audit of two models based on model performance and calibration, as well as a fairness audit based on summary statistics, subgroup performance, and subgroup calibration. We assessed the Epic End-of-Life (EOL) Index model and an internally developed Stanford Hospital Medicine (HM) Advance Care Planning (ACP) model in three practice settings: Primary Care, Inpatient Oncology, and Hospital Medicine, using clinicians' answers to the surprise question (“Would you be surprised if [patient X] passed away in [Y years]?”) as a surrogate outcome. For performance, the models had positive predictive value (PPV) at or above 0.76 in all settings. In Hospital Medicine and Inpatient Oncology, the Stanford HM ACP model had higher sensitivity (0.69 and 0.89, respectively) than the Epic EOL model (0.20 and 0.27), and better calibration (O/E 1.5 and 1.7) than the Epic EOL model (O/E 2.5 and 3.0). The Epic EOL model flagged fewer patients (11% and 21%, respectively) than the Stanford HM ACP model (38% and 75%). There were no differences in performance and calibration by sex. Both models had lower sensitivity in Hispanic/Latino male patients with Race listed as “Other.” Ten clinicians were surveyed after a presentation summarizing the audit. Of these, 10/10 reported that summary statistics, overall performance, and subgroup performance would affect their decision to use the model to guide care; 9/10 said the same for overall and subgroup calibration. The most commonly identified barriers to routinely conducting such reliability and fairness audits were poor demographic data quality and lack of data access. This audit required 115 person-hours across 8–10 months. Our recommendations for performing reliability and fairness audits include verifying data validity, analyzing model performance on intersectional subgroups, and collecting clinician-patient linkages as necessary for label generation by clinicians. Those responsible for AI models should require such audits before model deployment and mediate between model auditors and impacted stakeholders.
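
As an illustration of the quantities named in the abstract, the following is a minimal sketch of how sensitivity, PPV, the fraction of patients flagged, and an observed-to-expected (O/E) ratio might be tabulated per subgroup. It is not the authors' audit code. The function names, the 0.5 threshold, and the toy data are hypothetical, and O/E is assumed here to mean the number of observed positive labels divided by the sum of predicted probabilities, so that values nearer 1 indicate closer agreement; the abstract does not define it.

```python
from collections import defaultdict


def audit_metrics(records, threshold=0.5):
    """Summarize (predicted_probability, label) pairs, with label in {0, 1}.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    tp = fp = fn = tn = 0
    expected = 0.0  # sum of predicted probabilities (assumed O/E denominator)
    for prob, label in records:
        flagged = prob >= threshold
        expected += prob
        if flagged and label:
            tp += 1
        elif flagged:
            fp += 1
        elif label:
            fn += 1
        else:
            tn += 1

    def ratio(num, den):
        # Return None rather than dividing by zero for empty denominators.
        return num / den if den else None

    n = tp + fp + fn + tn
    return {
        "n": n,
        "flag_rate": ratio(tp + fp, n),       # fraction of patients flagged
        "sensitivity": ratio(tp, tp + fn),    # flagged among positive labels
        "ppv": ratio(tp, tp + fp),            # positive labels among flagged
        "o_e": ratio(tp + fn, expected),      # observed / expected positives
    }


def audit_by_subgroup(rows, threshold=0.5):
    """Group (subgroup, probability, label) rows and summarize each group.

    The subgroup key may be a tuple such as (sex, ethnicity, race) to
    examine intersectional subgroups.
    """
    groups = defaultdict(list)
    for subgroup, prob, label in rows:
        groups[subgroup].append((prob, label))
    return {g: audit_metrics(recs, threshold) for g, recs in groups.items()}


# Toy data for illustration only; these are not values from the study.
rows = [
    ("A", 0.9, 1), ("A", 0.7, 1), ("A", 0.2, 0), ("A", 0.6, 0),
    ("B", 0.8, 1), ("B", 0.3, 1), ("B", 0.1, 0), ("B", 0.4, 1),
]
report = audit_by_subgroup(rows)
for group, metrics in sorted(report.items()):
    print(group, metrics)
```

In a sketch like this, small subgroups yield unstable estimates, so a real audit would likely report the subgroup size alongside each metric, which is why `n` is returned here.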

Published in Frontiers in Digital Health

ISSN: 2673-253X (Online)
Publisher: Frontiers Media S.A.
Country of publisher: Switzerland
LCC subjects: Medicine: Public aspects of medicine; Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://www.frontiersin.org/journals/digital-health#
