Meta-Ensemble Learning for Heart Disease Prediction: A Stacking-Based Approach With Explainable AI

Mehwish Naz; Aqsa Khalid; Abdul Hameed; Rabia Taj; Waleed Mumtaz; Faiz Abdullah Alotaibi; Mrim M. Alnfiai

doi:10.1109/access.2025.3588683

IEEE Access (Jan 2025)

Affiliations

Mehwish Naz: Department of Software Engineering, Iqra University Islamabad Campus (IUIC), Islamabad, Pakistan
Aqsa Khalid: Department of Software Engineering, Iqra University Islamabad Campus (IUIC), Islamabad, Pakistan
Abdul Hameed: Department of Computer Science, The University of Chenab at Gujrat, Gujrat, Pakistan
Rabia Taj: Department of Software Engineering, Iqra University Islamabad Campus (IUIC), Islamabad, Pakistan
Waleed Mumtaz: Department of Software Engineering, Iqra University Islamabad Campus (IUIC), Islamabad, Pakistan
Faiz Abdullah Alotaibi: Department of Information Science, College of Humanities and Social Sciences, King Saud University, Riyadh, Saudi Arabia
Mrim M. Alnfiai: Department of Information Technology, College of Computers and Information Technology, Taif University, Taif, Saudi Arabia

DOI: https://doi.org/10.1109/access.2025.3588683
Journal volume & issue: Vol. 13
pp. 137271 – 137290

Abstract

Heart disease remains a leading cause of mortality globally, necessitating accurate, interpretable, and reliable predictive models to support early diagnosis and enhance patient care. This work presents a robust machine learning framework for classifying heart disease on three widely used benchmark datasets: Heart_2020_Cleaned, Heart Statlog Cleveland Hungary, and Cardio Train. A comparative analysis was conducted across these datasets to evaluate model performance under varying data characteristics. To ensure optimal model performance and generalizability, k-fold cross-validation was employed for hyperparameter tuning, enabling systematic evaluation and fine-tuning of model parameters while reducing the risk of overfitting. The proposed model attained an accuracy of 97.72% with an F1-score of 97.89% on the Heart_2020_Cleaned dataset (319,795 samples), 95.50% accuracy with a 95.24% F1-score on the Cardio Train dataset (70,000 samples), and 98.90% accuracy with a 98.86% F1-score on the Heart Statlog Cleveland Hungary dataset (1,190 instances). Of the three, the Heart Statlog Cleveland Hungary dataset yielded the highest performance metrics, indicating that the model identifies positive cases most accurately on this dataset. To mitigate class imbalance, the SMOTEENN resampling technique was applied, which improved the model's generalization performance across both majority and minority classes. Additionally, the integration of SHAP-based Explainable AI (XAI) techniques enhanced the interpretability of the model by offering clear and meaningful explanations of how individual features influenced the predictions. These insights contribute to greater transparency and trust in the model's decision-making process. Overall, the results demonstrate the robustness and practical applicability of the proposed framework for early detection and clinical management of heart disease.
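
The abstract reports accuracy alongside F1-score and notes that the datasets are class-imbalanced. As a minimal illustration of why both metrics matter in that setting, the sketch below computes them from confusion-matrix counts, assuming the standard binary definitions (the abstract does not state which averaging the authors used). The function name `accuracy_and_f1` and all counts are hypothetical and are not taken from the paper.

```python
def accuracy_and_f1(tp, fp, fn, tn):
    """Return (accuracy, F1) for the positive class from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    # F1 is the harmonic mean of precision and recall, which simplifies to
    # 2*TP / (2*TP + FP + FN). True negatives do not enter the formula.
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Hypothetical imbalanced test set: 1000 patients, 100 with heart disease.
# Classifier A labels every patient as healthy.
acc_a, f1_a = accuracy_and_f1(tp=0, fp=0, fn=100, tn=900)

# Classifier B finds 80 of the 100 positive cases, with 40 false alarms.
acc_b, f1_b = accuracy_and_f1(tp=80, fp=40, fn=20, tn=860)

print(f"A: accuracy={acc_a:.3f}, F1={f1_a:.3f}")
print(f"B: accuracy={acc_b:.3f}, F1={f1_b:.3f}")
```

In this made-up example, classifier A reaches 90% accuracy without detecting a single positive case, and only its F1-score of zero exposes the failure. This is the kind of gap that resampling methods such as SMOTEENN are intended to reduce.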

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
