PLoS ONE (Jan 2022)

A 12-hospital prospective evaluation of a clinical decision support prognostic algorithm based on logistic regression as a form of machine learning to facilitate decision making for patients with suspected COVID-19

  • Monica I. Lupei,
  • Danni Li,
  • Nicholas E. Ingraham,
  • Karyn D. Baum,
  • Bradley Benson,
  • Michael Puskarich,
  • David Milbrandt,
  • Genevieve B. Melton,
  • Daren Scheppmann,
  • Michael G. Usher,
  • Christopher J. Tignanelli

Journal volume & issue
Vol. 17, no. 1

Abstract

Read online

Objective To prospectively evaluate a logistic regression-based machine learning (ML) prognostic algorithm implemented in real-time as a clinical decision support (CDS) system for symptomatic persons under investigation (PUI) for Coronavirus disease 2019 (COVID-19) in the emergency department (ED). Methods We developed in a 12-hospital system a model using training and validation followed by a real-time assessment. The LASSO guided feature selection included demographics, comorbidities, home medications, vital signs. We constructed a logistic regression-based ML algorithm to predict “severe” COVID-19, defined as patients requiring intensive care unit (ICU) admission, invasive mechanical ventilation, or died in or out-of-hospital. Training data included 1,469 adult patients who tested positive for Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) within 14 days of acute care. We performed: 1) temporal validation in 414 SARS-CoV-2 positive patients, 2) validation in a PUI set of 13,271 patients with symptomatic SARS-CoV-2 test during an acute care visit, and 3) real-time validation in 2,174 ED patients with PUI test or positive SARS-CoV-2 result. Subgroup analysis was conducted across race and gender to ensure equity in performance. Results The algorithm performed well on pre-implementation validations for predicting COVID-19 severity: 1) the temporal validation had an area under the receiver operating characteristic (AUROC) of 0.87 (95%-CI: 0.83, 0.91); 2) validation in the PUI population had an AUROC of 0.82 (95%-CI: 0.81, 0.83). The ED CDS system performed well in real-time with an AUROC of 0.85 (95%-CI, 0.83, 0.87). Zero patients in the lowest quintile developed “severe” COVID-19. Patients in the highest quintile developed “severe” COVID-19 in 33.2% of cases. The models performed without significant differences between genders and among race/ethnicities (all p-values > 0.05). Conclusion A logistic regression model-based ML-enabled CDS can be developed, validated, and implemented with high performance across multiple hospitals while being equitable and maintaining performance in real-time validation.