Machine learning generalizability across healthcare settings: insights from multi-site COVID-19 screening

Jenny Yang; Andrew A. S. Soltan; David A. Clifton

doi:10.1038/s41746-022-00614-9

npj Digital Medicine (Jun 2022)

Affiliations

Jenny Yang: Institute of Biomedical Engineering, Dept. Engineering Science, University of Oxford
Andrew A. S. Soltan: John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust
David A. Clifton: Institute of Biomedical Engineering, Dept. Engineering Science, University of Oxford

Journal volume & issue: Vol. 5, no. 1
pp. 1 – 8

Abstract

As patient health information is highly regulated due to privacy concerns, most machine learning (ML)-based healthcare studies are unable to test on external patient cohorts, resulting in a gap between locally reported model performance and cross-site generalizability. Different approaches have been introduced for developing models across multiple clinical sites; however, less attention has been given to adopting ready-made models in new settings. We introduce three methods to do this: (1) applying a ready-made model “as-is”; (2) readjusting the decision threshold on the model’s output using site-specific data; and (3) fine-tuning the model using site-specific data via transfer learning. Using a case study of COVID-19 diagnosis across four NHS Hospital Trusts, we show that all methods achieve clinically effective performance (NPV > 0.959), with transfer learning achieving the best results (mean AUROCs between 0.870 and 0.925). Our models demonstrate that site-specific customization improves predictive performance when compared to other ready-made approaches.
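
The second method, readjusting the decision threshold with site-specific data, can be illustrated with a minimal sketch. The abstract does not state how the threshold is chosen, so the selection rule here (take the highest threshold that still reaches a target sensitivity on a local calibration set) is an assumption, as are the function names `readjust_threshold` and `sensitivity_and_npv` and the toy scores. It is not the authors' implementation.

```python
import numpy as np


def readjust_threshold(scores, labels, target_sensitivity):
    """Pick a site-specific decision threshold (hypothetical helper).

    Returns the highest threshold at which the share of positive cases
    with score >= threshold is at least target_sensitivity.
    The selection rule is an assumption made for illustration.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    pos_scores = np.sort(scores[labels == 1])[::-1]  # positives, descending
    if pos_scores.size == 0:
        raise ValueError("calibration set needs at least one positive case")
    # Number of positive cases that must be flagged to reach the target.
    k = max(int(np.ceil(target_sensitivity * pos_scores.size)), 1)
    return pos_scores[k - 1]


def sensitivity_and_npv(scores, labels, threshold):
    """Sensitivity and negative predictive value at a given threshold."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    pred = scores >= threshold  # True means "predicted positive"
    tp = np.sum(pred & (labels == 1))
    fn = np.sum(~pred & (labels == 1))
    tn = np.sum(~pred & (labels == 0))
    sens = tp / (tp + fn)
    npv = tn / (tn + fn) if (tn + fn) > 0 else float("nan")
    return sens, npv


# Toy outputs of a ready-made model on a new site's calibration data
# (made-up numbers, for illustration only).
site_scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.1]
site_labels = [1, 1, 1, 0, 1, 0]

threshold = readjust_threshold(site_scores, site_labels, target_sensitivity=0.75)
sens, npv = sensitivity_and_npv(site_scores, site_labels, threshold)
```

Applying the model "as-is" corresponds to keeping the threshold fixed at the value chosen at the development site and skipping the call to `readjust_threshold`. Fine-tuning would additionally update the model weights on the new site's data, which this sketch does not cover.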

Published in npj Digital Medicine

ISSN: 2398-6352 (Online)
Publisher: Nature Portfolio
Country of publisher: United Kingdom
LCC subjects: Medicine: Medicine (General): Computer applications to medicine. Medical informatics
Website: https://www.nature.com/npjdigitalmed/
