Frontiers in Artificial Intelligence (Apr 2022)

A Unified Framework on Generalizability of Clinical Prediction Models

  • Bohua Wan,
  • Brian Caffo,
  • S. Swaroop Vedula

DOI
https://doi.org/10.3389/frai.2022.872720
Journal volume & issue
Vol. 5

Abstract


To be useful, clinical prediction models (CPMs) must be generalizable to patients in new settings. Evaluating the generalizability of CPMs helps identify spurious relationships in data, provides insights on when they fail, and thus improves the explainability of the CPMs. There are discontinuities between the clinical research and machine learning domains in concepts related to generalizability of CPMs. Specifically, conventional statistical reasons for poor generalizability, such as inadequate model development for the purposes of generalizability, differences in coding of predictors and outcome between development and external datasets, measurement error, inability to measure some predictors, and missing data, all have differing and often complementary treatments in the two domains. Much of the current machine learning literature on generalizability of CPMs is framed in terms of dataset shift, of which several types have been described. However, little research exists to synthesize concepts across the two domains. Bridging this conceptual discontinuity in the context of CPMs can facilitate systematic development of CPMs and evaluation of their sensitivity to factors that affect generalizability. We survey generalizability and dataset shift in CPMs from both the clinical research and machine learning perspectives, and describe a unifying framework to analyze generalizability of CPMs and to explain their sensitivity to factors affecting it. Our framework leads to a set of signaling statements that can be used to characterize differences between datasets in terms of factors that affect generalizability of the CPMs.

Keywords