Scientific Reports (Jul 2024)

SHAP based predictive modeling for 1 year all-cause readmission risk in elderly heart failure patients: feature selection and model interpretation

  • Hao Luo,
  • Congyu Xiang,
  • Lang Zeng,
  • Shikang Li,
  • Xue Mei,
  • Lijuan Xiong,
  • Yanxu Liu,
  • Cong Wen,
  • Yangyang Cui,
  • Linqin Du,
  • Yang Zhou,
  • Kun Wang,
  • Lan Li,
  • Zonglian Liu,
  • Qi Wu,
  • Jun Pu,
  • Rongchuan Yue

DOI
https://doi.org/10.1038/s41598-024-67844-7
Journal volume & issue
Vol. 14, no. 1
pp. 1 – 15

Abstract

Heart failure (HF) is a significant global public health concern with a high readmission rate, posing a serious threat to the health of the elderly population. While several studies have used machine learning (ML) to develop all-cause readmission risk prediction models for elderly patients with HF, few have integrated ML-selected features with those chosen by human experts to assess readmission in HF patients. A retrospective analysis of 8396 elderly HF patients hospitalized at the Affiliated Hospital of North Sichuan Medical College from January 1, 2018 to December 31, 2021 was conducted. Variables selected by XGBoost, LASSO regression, and random forest constituted the machine group, while the human expert group comprised variables chosen by two experienced cardiovascular professors. The variables selected by both groups were combined to form a human–machine collaboration group. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC). The SHapley Additive exPlanations (SHAP) method was used to quantify the importance of each predictive feature, explain the impact of individual features on the model, and provide a visual representation. A total of 73 features were included for model development. The human–machine collaboration model, built with CatBoost, achieved an AUC of 0.83617, an F1-score of 0.73521, and a Brier score of 0.16536 on the validation set. This model demonstrated superior predictive performance compared to models based solely on human expert selection or solely on machine selection. A SHAP plot was then used to visualize feature contributions in the human–machine collaboration model, revealing HGB, NT-proBNP, smoking history, NYHA classification, and LVEF as the 5 most important features. This study indicates that the human–machine collaboration model outperforms models relying solely on human expert selection or machine algorithms in predicting all-cause readmission in elderly HF patients. The application of the SHAP method enhanced the interpretability of the model outcomes, helping clinicians pinpoint risk factors associated with HF readmission. This enables the formulation of tailored treatment strategies, offering a more personalized approach to patient care.
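
The feature-pooling step and the three reported metrics can be illustrated with a minimal Python sketch. It is not the authors' code. How the five named features split between the machine and expert groups is a hypothetical assumption, the labels and probabilities are invented toy values, and the 0.5 decision threshold for the F1-score is assumed because the abstract does not state one.

```python
# Hypothetical split: the abstract does not say which group selected which feature.
machine_features = {"HGB", "NT-proBNP", "LVEF"}
expert_features = {"NYHA classification", "smoking history", "LVEF"}

# Human-machine collaboration group: union of both selections.
collaboration_features = machine_features | expert_features


def roc_auc(y_true, y_prob):
    """AUC as the chance that a random positive case is scored above
    a random negative case (ties count as 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def f1_score(y_true, y_prob, threshold=0.5):
    """F1 = 2*TP / (2*TP + FP + FN) after thresholding the probabilities."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for y, yhat in zip(y_true, y_pred) if y == 1 and yhat == 1)
    fp = sum(1 for y, yhat in zip(y_true, y_pred) if y == 0 and yhat == 1)
    fn = sum(1 for y, yhat in zip(y_true, y_pred) if y == 1 and yhat == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome
    (lower is better; 0 is perfect)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Toy validation data (invented): 1 = readmitted within 1 year, 0 = not readmitted.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.2, 0.7, 0.4, 0.6, 0.1, 0.8, 0.3]

print(sorted(collaboration_features))
print("AUC:", roc_auc(y_true, y_prob))
print("F1:", f1_score(y_true, y_prob))
print("Brier:", brier_score(y_true, y_prob))
```

The numbers this prints describe only the toy data and are unrelated to the values reported in the abstract. In practice, library implementations of these metrics would likely be used instead of the hand-written functions.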

Keywords