An Efficient Computational Risk Prediction Model of Heart Diseases Based on Dual-Stage Stacked Machine Learning Approaches

Subhash Mondal; Ranjan Maity; Yachang Omo; Soumadip Ghosh; Amitava Nag

doi:10.1109/ACCESS.2024.3350996

IEEE Access (Jan 2024)

Affiliations

Subhash Mondal: Department of Computer Science and Engineering, Central Institute of Technology Kokrajhar, Kokrajhar, Assam, India
Ranjan Maity: Department of Computer Science and Engineering, Central Institute of Technology Kokrajhar, Kokrajhar, Assam, India
Yachang Omo: Department of Civil Engineering, Central Institute of Technology Kokrajhar, Kokrajhar, Assam, India
Soumadip Ghosh: Department of Computer Science and Engineering, Future Institute of Technology, Kolkata, West Bengal, India
Amitava Nag: Department of Computer Science and Engineering, Central Institute of Technology Kokrajhar, Kokrajhar, Assam, India

DOI: https://doi.org/10.1109/ACCESS.2024.3350996
Journal volume & issue: Vol. 12, pp. 7255–7270

Abstract

Cardiovascular diseases (CVDs) remain a leading cause of global mortality, which makes effective risk prediction models necessary to counter rising heart disease (HD) mortality rates. This work presents a novel dual-stage stacked machine learning (ML) computational risk prediction model for cardiac disorders. Using a dataset of 1190 patients with eleven significant characteristics, drawn from five distinct sources, five ML classifiers are used to build the initial prediction model. To ensure robustness and generalizability, the classifiers are cross-validated ten times. Model performance is optimized with two hyperparameter tuning approaches, RandomizedSearchCV and GridSearchCV, which search for the best estimator settings. The highest-performing models, specifically Random Forest, Extreme Gradient Boost, and Decision Tree, are then refined further using a stacking ensemble technique. The stacking model, which combines the strengths of the three models, attains an accuracy of 96%, a recall of 0.98, and a ROC-AUC score of 0.96. Notably, the rate of false-negative results is below 1%, indicating a highly accurate model that is not overfitted. To evaluate the model's stability and repeatability, a comparable dataset of 1000 instances is employed, on which the model achieves an accuracy of 96.88% under identical experimental settings. This highlights the strength and dependability of the proposed computational model for predicting the risk of cardiac illnesses. The outcomes indicate that this two-stage stacking ML method shows potential for prompt and precise diagnosis, thereby supporting the worldwide effort to reduce deaths caused by heart disease.
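
The two stages described in the abstract (tuning base classifiers, then stacking the best ones) can be illustrated with a minimal scikit-learn sketch. This is not the authors' code. Several things here are assumptions made only for illustration: the data is synthetic (generated to match only the 1190 x 11 shape mentioned above), scikit-learn's GradientBoostingClassifier stands in for Extreme Gradient Boost, the hyperparameter grids are arbitrary small examples, and the logistic-regression meta-learner is a guess, since the abstract does not name the final estimator.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, recall_score, roc_auc_score
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# ASSUMPTION: synthetic stand-in data with the same shape as the dataset
# described in the abstract (1190 rows, 11 features); not the real data.
X, y = make_classification(
    n_samples=1190, n_features=11, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Stage 1: tune each base classifier with 10-fold cross-validation.
# The grids below are illustrative only, not the values used in the paper.
rf_search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions={"n_estimators": [50, 100], "max_depth": [None, 5, 10]},
    n_iter=4,
    cv=10,
    random_state=0,
)
# ASSUMPTION: GradientBoostingClassifier replaces Extreme Gradient Boost
# so the sketch depends only on scikit-learn.
gb_search = GridSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_grid={"n_estimators": [50, 100], "learning_rate": [0.05, 0.1]},
    cv=10,
)
dt_search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid={"max_depth": [3, 5, None], "min_samples_leaf": [1, 5]},
    cv=10,
)
for search in (rf_search, gb_search, dt_search):
    search.fit(X_train, y_train)

# Stage 2: stack the tuned models. StackingClassifier clones and refits them,
# then trains the meta-learner on their out-of-fold predictions.
stack = StackingClassifier(
    estimators=[
        ("rf", rf_search.best_estimator_),
        ("gb", gb_search.best_estimator_),
        ("dt", dt_search.best_estimator_),
    ],
    final_estimator=LogisticRegression(),  # ASSUMPTION: meta-learner choice
    cv=10,
)
stack.fit(X_train, y_train)

# Report the same three metrics the abstract quotes, on held-out data.
y_pred = stack.predict(X_test)
y_score = stack.predict_proba(X_test)[:, 1]
accuracy = accuracy_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
roc_auc = roc_auc_score(y_test, y_score)
print(f"accuracy={accuracy:.3f} recall={recall:.3f} roc_auc={roc_auc:.3f}")
```

Because the data is synthetic, the printed scores say nothing about the results reported above; the sketch only shows how the tuning and stacking steps fit together.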

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
