IEEE Access (Jan 2025)

Meta-Ensemble Learning for Heart Disease Prediction: A Stacking-Based Approach With Explainable AI

  • Mehwish Naz,
  • Aqsa Khalid,
  • Abdul Hameed,
  • Rabia Taj,
  • Waleed Mumtaz,
  • Faiz Abdullah Alotaibi,
  • Mrim M. Alnfiai

DOI
https://doi.org/10.1109/access.2025.3588683
Journal volume & issue
Vol. 13
pp. 137271 – 137290

Abstract

Heart disease remains a leading cause of mortality globally, necessitating accurate, interpretable, and reliable predictive models to support early diagnosis and enhance patient care. This work presents a robust machine learning framework for classifying heart disease, evaluated on three widely used benchmark datasets: Heart_2020_Cleaned, Heart Statlog Cleveland Hungary, and Cardio Train. A comparative analysis was conducted across these datasets to evaluate model performance under varying data characteristics. To ensure optimal model performance and generalizability, k-fold cross-validation was employed for hyperparameter tuning, enabling systematic evaluation and fine-tuning of model parameters while reducing the risk of overfitting. The proposed model attained an accuracy of 97.72% with an F1-score of 97.89% on the Heart_2020_Cleaned dataset (319,795 samples), 95.50% accuracy with a 95.24% F1-score on the Cardio Train dataset (70,000 samples), and 98.90% accuracy with a 98.86% F1-score on the Heart Statlog Cleveland Hungary dataset (1,190 instances). Among the three datasets, Heart Statlog Cleveland Hungary yielded the highest performance metrics, indicating the model’s enhanced ability to accurately identify positive cases within this dataset. To mitigate class imbalance, the SMOTEENN resampling technique was applied, which improved the model’s generalization performance across both majority and minority classes. Additionally, SHAP-based Explainable AI (XAI) techniques enhanced the interpretability of the model by explaining how individual features influenced its predictions. These insights contribute to greater transparency and trust in the model’s decision-making process. Overall, the results demonstrate the robustness and practical applicability of the proposed framework for early detection and clinical management of heart disease.
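
The core of the approach described above, a stacking ensemble whose hyperparameters are tuned with k-fold cross-validation, can be illustrated with a minimal scikit-learn sketch. Everything specific here is an assumption for illustration only: the data are synthetic (not one of the three benchmark datasets), and the base learners, meta-learner, and parameter grid are placeholder choices, since the abstract does not state which the authors used. The SMOTEENN resampling step (available in imbalanced-learn) and the SHAP explanation step are omitted to keep the example dependency-light.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic, mildly imbalanced stand-in data (NOT one of the paper's datasets).
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=6,
    weights=[0.7, 0.3], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Base learners and meta-learner are illustrative assumptions,
# not the models reported in the paper.
stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=30, random_state=0)),
        ("knn", KNeighborsClassifier()),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,  # out-of-fold base predictions are used to train the meta-learner
)

# k-fold cross-validated search over one meta-learner hyperparameter.
param_grid = {"final_estimator__C": [0.1, 1.0, 10.0]}
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(stack, param_grid=param_grid, scoring="f1", cv=cv)
search.fit(X_train, y_train)

# Evaluate the tuned ensemble on the held-out split.
y_pred = search.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)
print(f"best params: {search.best_params_}")
print(f"accuracy={accuracy:.3f}  F1={f1:.3f}")
```

The scores printed by this sketch describe the synthetic data only and are not comparable to the figures reported in the abstract.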

Keywords