Comparison of deep learning with traditional models to predict preventable acute care use and spending among heart failure patients

Maor Lewis; Guy Elad; Moran Beladev; Gal Maor; Kira Radinsky; Dor Hermann; Yoav Litani; Tal Geller; Jesse M. Pines; Nathan I. Shapiro; Jose F. Figueroa

doi:10.1038/s41598-020-80856-3

Scientific Reports (Jan 2021)

Affiliations

Maor Lewis: Diagnostic Robotics Inc.
Guy Elad: Diagnostic Robotics Inc.
Moran Beladev: Diagnostic Robotics Inc.
Gal Maor: Diagnostic Robotics Inc.
Kira Radinsky: Diagnostic Robotics Inc.
Dor Hermann: Diagnostic Robotics Inc.
Yoav Litani: Diagnostic Robotics Inc.
Tal Geller: Diagnostic Robotics Inc.
Jesse M. Pines: Diagnostic Robotics Inc.
Nathan I. Shapiro: Diagnostic Robotics Inc.
Jose F. Figueroa: Department of Health Policy and Management, Harvard T.H. Chan School of Public Health

DOI: https://doi.org/10.1038/s41598-020-80856-3
Journal volume & issue: Vol. 11, no. 1, pp. 1–10

Abstract

Recent health reforms have created incentives for cardiologists and accountable care organizations to participate in value-based care models for heart failure (HF). Accurate risk stratification of HF patients is critical to efficiently deploy interventions aimed at reducing preventable utilization. The goal of this paper was to compare deep learning approaches with traditional logistic regression (LR) for predicting preventable utilization among HF patients. We conducted a prognostic study using data on 93,260 HF patients continuously enrolled for 2 years with a large U.S. commercial insurer to develop and validate prediction models for three outcomes of interest: preventable hospitalizations, preventable emergency department (ED) visits, and preventable costs. Patients were split into training, validation, and testing samples. Outcomes were modeled using traditional and enhanced LR and compared with a gradient boosting model and with deep learning models using sequential and non-sequential inputs. Evaluation metrics included precision (positive predictive value) at k, cost capture, and area under the receiver operating characteristic curve (AUROC). Deep learning models consistently outperformed LR for all three outcomes with respect to the chosen evaluation metrics. Precision at 1% for preventable hospitalizations was 43% for deep learning compared to 30% for enhanced LR. Precision at 1% for preventable ED visits was 39% for deep learning compared to 33% for enhanced LR. For preventable cost, cost capture at 1% was 30% for sequential deep learning, compared to 18% for enhanced LR. The highest AUROCs for deep learning were 0.778, 0.681 and 0.727 for the three outcomes, respectively. These results offer a promising approach to identify patients for targeted interventions.
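
The abstract names precision at k and cost capture without defining them. The sketch below shows one common reading of these metrics, which is an assumption and not necessarily the authors' exact definition: patients are ranked by predicted risk, the top fraction k is selected, and then either the share of selected patients who had the outcome (precision at k) or the share of total cost incurred by the selected patients (cost capture at k) is computed. The function names and the toy data are hypothetical and are not taken from the paper.

```python
def top_k_indices(scores, k_fraction):
    """Return indices of the top k_fraction of patients, ranked by predicted score."""
    # Assumption: select at least one patient, rounding the cutoff to the nearest integer.
    n_top = max(1, round(len(scores) * k_fraction))
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:n_top]


def precision_at_k(scores, outcomes, k_fraction):
    """Share of the top-ranked patients who actually had the outcome (0/1 labels)."""
    top = top_k_indices(scores, k_fraction)
    return sum(outcomes[i] for i in top) / len(top)


def cost_capture_at_k(scores, costs, k_fraction):
    """Share of total observed cost incurred by the top-ranked patients."""
    total = sum(costs)
    if total == 0:
        return 0.0  # no cost to capture
    top = top_k_indices(scores, k_fraction)
    return sum(costs[i] for i in top) / total


# Toy data for ten hypothetical patients (not from the study).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
outcomes = [1, 0, 1, 0, 0, 0, 1, 0, 0, 0]
costs = [500, 0, 300, 0, 0, 0, 200, 0, 0, 0]

p_at_20 = precision_at_k(scores, outcomes, 0.2)   # top 2 patients, 1 had the outcome
cc_at_20 = cost_capture_at_k(scores, costs, 0.2)  # top 2 patients hold 500 of 1000
print(p_at_20, cc_at_20)
```

Under this reading, a precision at 1% of 43% would mean that 43% of the patients in the highest-risk 1% went on to have a preventable hospitalization.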

Published in Scientific Reports

ISSN: 2045-2322 (Online)
Publisher: Nature Portfolio
Country of publisher: United Kingdom
LCC subjects: Medicine; Science
Website: https://www.nature.com/srep/
