BMC Medical Informatics and Decision Making (May 2023)

Risk prediction of heart failure in patients with ischemic heart disease using network analytics and stacking ensemble learning

  • Dejia Zhou,
  • Hang Qiu,
  • Liya Wang,
  • Minghui Shen

DOI
https://doi.org/10.1186/s12911-023-02196-2
Journal volume & issue
Vol. 23, no. 1
pp. 1 – 12

Abstract

Background: Heart failure (HF) is a major complication following ischemic heart disease (IHD) and adversely affects patient outcomes. Early prediction of HF risk in patients with IHD is beneficial for timely intervention and for reducing disease burden.

Methods: Two cohorts were established from hospital discharge records in Sichuan, China, during 2015-2019: cases, i.e., patients first diagnosed with IHD and then with HF (N = 11,862), and controls, i.e., IHD patients without HF (N = 25,652). A directed personal disease network (PDN) was constructed for each patient, and the PDNs were then merged to generate a baseline disease network (BDN) for each of the two cohorts, which identifies the health trajectories of patients and the complex progression patterns. The differences between the BDNs of the two cohorts were represented as a disease-specific network (DSN). Three novel network features were extracted from the PDN and DSN to represent the similarity of disease patterns and the specificity trends from IHD to HF. A stacking-based ensemble model, DXLR, was proposed to predict HF risk in IHD patients using the novel network features and basic demographic features (i.e., age and sex). The SHapley Additive exPlanations (SHAP) method was applied to analyze the feature importance of the DXLR model.

Results: Compared with six traditional machine learning models, the DXLR model exhibited the highest AUC (0.934 ± 0.004), accuracy (0.857 ± 0.007), precision (0.723 ± 0.014), recall (0.892 ± 0.012) and F1 score (0.798 ± 0.010). In the feature importance analysis, the novel network features ranked as the top three features, playing a notable role in predicting HF risk in IHD patients. The feature comparison experiment also indicated that the novel network features were superior to those proposed by the state-of-the-art study in improving the performance of the prediction model, with increases of 19.9% in AUC, 18.7% in accuracy, 30.7% in precision, 37.4% in recall, and 33.7% in F1 score.

Conclusions: The proposed approach, which combines network analytics and ensemble learning, effectively predicts HF risk in patients with IHD. This highlights the potential value of network-based machine learning in the field of disease risk prediction using administrative data.
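
The PDN, BDN and DSN pipeline described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not state how edges are weighted or how the difference between the two BDNs is computed, so the following are assumptions made here for illustration: a PDN edge runs from every earlier-diagnosed disease to every later-diagnosed one, a BDN edge weight is the fraction of patients in the cohort whose PDN contains that edge, and the DSN keeps edges that are more prevalent among cases than controls. The function names and the toy records (ICD-10 categories I25, I50, E11, I10) are hypothetical.

```python
from collections import Counter
from itertools import combinations


def personal_disease_network(visits):
    """Build a directed PDN from one patient's chronologically ordered visits.

    `visits` is a list of sets of diagnosis codes. Assumption: an edge (a, b)
    means disease a was first recorded strictly before disease b.
    """
    first_seen = {}
    for t, codes in enumerate(visits):
        for code in codes:
            first_seen.setdefault(code, t)  # keep the earliest visit index
    edges = set()
    for a, b in combinations(first_seen, 2):
        if first_seen[a] < first_seen[b]:
            edges.add((a, b))
        elif first_seen[b] < first_seen[a]:
            edges.add((b, a))
        # diseases first recorded at the same visit get no directed edge
    return edges


def baseline_disease_network(pdns):
    """Merge PDNs into a BDN. Assumed weight: share of patients with the edge."""
    counts = Counter(edge for pdn in pdns for edge in pdn)
    return {edge: c / len(pdns) for edge, c in counts.items()}


def disease_specific_network(bdn_case, bdn_control):
    """Assumed DSN: edges whose weight is higher in cases than in controls."""
    dsn = {}
    for edge, weight in bdn_case.items():
        diff = weight - bdn_control.get(edge, 0.0)
        if diff > 0:
            dsn[edge] = diff
    return dsn


# Toy records, invented for illustration only.
cases = [
    [{"I25"}, {"E11"}, {"I50"}],
    [{"I25", "I10"}, {"I50"}],
]
controls = [
    [{"I25"}, {"I10"}],
    [{"I25"}, {"E11"}],
]

bdn_case = baseline_disease_network([personal_disease_network(p) for p in cases])
bdn_control = baseline_disease_network([personal_disease_network(p) for p in controls])
dsn = disease_specific_network(bdn_case, bdn_control)
```

In this toy example the edge from I25 to E11 appears equally often in both cohorts and therefore drops out of the DSN, while the edges leading into I50 remain. Patient-level features such as the overlap between a patient's PDN and the DSN could then be passed, together with age and sex, to a stacking classifier; the abstract does not define the three network features or the base learners of DXLR, so those are left out here.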

Keywords