Federated Learning of Electronic Health Records to Improve Mortality Prediction in Hospitalized Patients With COVID-19: Machine Learning Approach

Vaid, Akhil; Jaladanki, Suraj K; Xu, Jie; Teng, Shelly; Kumar, Arvind; Lee, Samuel; Somani, Sulaiman; Paranjpe, Ishan; De Freitas, Jessica K; Wanyan, Tingyi; Johnson, Kipp W; Bicak, Mesude; Klang, Eyal; Kwon, Young Joon; Costa, Anthony; Zhao, Shan; Miotto, Riccardo; Charney, Alexander W; Böttinger, Erwin; Fayad, Zahi A; Nadkarni, Girish N; Wang, Fei; Glicksberg, Benjamin S

doi:10.2196/24207

JMIR Medical Informatics (Jan 2021)

DOI: https://doi.org/10.2196/24207
Journal volume & issue: Vol. 9, no. 1, p. e24207

Abstract

Background: Machine learning models require large datasets that may be siloed across different health care institutions. Machine learning studies that focus on COVID-19 have been limited to single-hospital data, which limits model generalizability.

Objective: We aimed to use federated learning, a machine learning technique that avoids locally aggregating raw clinical data across multiple institutions, to predict mortality in hospitalized patients with COVID-19 within 7 days.

Methods: Patient data were collected from the electronic health records of 5 hospitals within the Mount Sinai Health System. Logistic regression with L1 regularization/least absolute shrinkage and selection operator (LASSO) and multilayer perceptron (MLP) models were trained by using local data at each site. We developed a pooled model with combined data from all 5 sites, and a federated model that only shared parameters with a central aggregator.

Results: The LASSO federated model outperformed the LASSO local model at 3 hospitals, and the MLP federated model performed better than the MLP local model at all 5 hospitals, as determined by the area under the receiver operating characteristic curve. The LASSO pooled model outperformed the LASSO federated model at all hospitals, and the MLP federated model outperformed the MLP pooled model at 2 hospitals.

Conclusions: The federated learning of COVID-19 electronic health record data shows promise in developing robust predictive models without compromising patient privacy.
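The federated setup described in the Methods, where each site trains on its own data and only parameters reach a central aggregator, can be illustrated with a minimal sketch. The abstract does not state the aggregation rule, so sample-size-weighted parameter averaging is an assumption here, as are the function names (`local_update`, `federated_average`, `train_federated`), the hyperparameters, and the synthetic data. This is not the authors' implementation.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def local_update(weights, X, y, lr=0.1, l1=0.01, epochs=5):
    """Run a few epochs of L1-regularized logistic regression at one site."""
    w = list(weights)
    n = len(y)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj / n
        # Subgradient step: (wj > 0) - (wj < 0) is the sign of wj.
        w = [wj - lr * (gj + l1 * ((wj > 0) - (wj < 0)))
             for wj, gj in zip(w, grad)]
    return w


def federated_average(site_weights, site_sizes):
    """Aggregator: average site parameters weighted by site sample count."""
    total = sum(site_sizes)
    dim = len(site_weights[0])
    return [sum(w[j] * n for w, n in zip(site_weights, site_sizes)) / total
            for j in range(dim)]


def train_federated(sites, n_features, rounds=20):
    """sites is a list of (X, y) pairs; only weight vectors are exchanged."""
    global_w = [0.0] * n_features
    for _ in range(rounds):
        updates = [local_update(global_w, X, y) for X, y in sites]
        global_w = federated_average(updates, [len(y) for _, y in sites])
    return global_w


# Synthetic stand-in for 5 sites (hypothetical data, not patient records).
random.seed(0)
true_w = [1.5, -2.0, 0.0]
sites = []
for n in (200, 150, 300, 100, 250):
    X = [[random.gauss(0, 1) for _ in true_w] for _ in range(n)]
    y = [1.0 if random.random() < sigmoid(sum(a * b for a, b in zip(true_w, xi)))
         else 0.0 for xi in X]
    sites.append((X, y))

global_w = train_federated(sites, n_features=3)
```

In this sketch the raw `(X, y)` pairs are only ever read inside `local_update`; the aggregator sees nothing but weight vectors and site sizes, which is the property the abstract highlights.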

Published in JMIR Medical Informatics

ISSN: 2291-9694 (Online)
Publisher: JMIR Publications
Country of publisher: Canada
LCC subjects: Medicine: Medicine (General): Computer applications to medicine. Medical informatics
Website: https://medinform.jmir.org
