IEEE Access (Jan 2020)

A Stacking Ensemble Model to Predict Daily Number of Hospital Admissions for Cardiovascular Diseases

  • Zhixu Hu,
  • Hang Qiu,
  • Ziqi Su,
  • Minghui Shen,
  • Ziyu Chen

DOI
https://doi.org/10.1109/ACCESS.2020.3012143
Journal volume & issue
Vol. 8
pp. 138719 – 138729

Abstract

With lifestyle and environmental changes, the prevalence of cardiovascular diseases (CVDs) is rising, putting pressure on limited medical resources. Accurate forecasting of daily counts of hospital admissions (HAs) for CVDs can help optimize the allocation of those resources. In this study, we propose a stacking ensemble model with a direct prediction strategy to predict the daily number of CVD admissions using HA data, air pollution data, and meteorological data. The sequential forward floating selection method with early stopping was applied for feature selection. Five machine learning models, namely linear regression (LR), support vector regression (SVR), extreme gradient boosting (XGBoost), random forest (RF), and gradient boosting decision tree (GBDT), were used as base learners to construct the stacking model. We compared the performance of the proposed stacking model with that of the five base learners on three datasets. The experimental results indicate that our model performed best on all three datasets under four evaluation criteria: mean absolute error (MAE), root mean square error (RMSE), mean absolute percentage error (MAPE), and coefficient of determination (R2). In particular, on the CVDs dataset, the MAPE is 15.103 for LR, 11.862 for SVR, 10.571 for XGBoost, 10.378 for GBDT, 10.333 for RF, and 9.679 for the stacking model. Compared with the best base learner, RF, the stacking model reduced MAPE, RMSE, and MAE by 6.3%, 7.4%, and 6.3%, respectively, and improved R2 by 1.7%. These results suggest that the proposed stacking model can effectively forecast the daily number of hospitalizations for CVDs and provide decision support for hospital managers.
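
The four evaluation criteria and the relative improvement quoted above can be illustrated with a minimal sketch. The metric definitions below are the standard textbook ones and are assumed, not confirmed, to match those used in the paper. The function names and the toy admission counts are hypothetical. Only the MAPE values in `mape_by_model` come from the abstract.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (assumes no zero in y_true)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Coefficient of determination."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def relative_decrease(baseline, new):
    """Percentage by which `new` is lower than `baseline`."""
    return 100.0 * (baseline - new) / baseline

# Hypothetical daily admission counts, for illustration only (not study data).
y_true = [100, 120, 80, 100]
y_pred = [110, 115, 85, 95]
print(mae(y_true, y_pred), rmse(y_true, y_pred), mape(y_true, y_pred), r2(y_true, y_pred))

# MAPE values on the CVDs dataset, as reported in the abstract.
mape_by_model = {
    "LR": 15.103, "SVR": 11.862, "XGBoost": 10.571,
    "GBDT": 10.378, "RF": 10.333, "Stacking": 9.679,
}
base_learners = {k: v for k, v in mape_by_model.items() if k != "Stacking"}
best_base = min(base_learners, key=base_learners.get)  # "RF"
improvement = relative_decrease(mape_by_model[best_base], mape_by_model["Stacking"])
print(best_base, round(improvement, 1))  # RF 6.3
```

The relative decrease is a ratio, so it does not depend on whether the reported MAPE values are expressed as percentages or as fractions.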

Keywords