IEEE Access (Jan 2024)
Using Machine Learning for Detection and Prediction of Chronic Diseases
Abstract
Heart attacks are a leading cause of mortality worldwide, which makes accurate predictive models essential for early detection and intervention. This study addresses class imbalance in medical datasets, focusing on heart attack prediction with the Behavioral Risk Factor Surveillance System (BRFSS) dataset. After rigorous data cleaning and preparation, the refined dataset comprises 399,875 instances and retains 47 significant features. Balanced accuracy and macro-recall were chosen as the primary metrics to ensure fair performance evaluation across classes in the imbalanced dataset. The proposed system evaluates in detail several algorithms known for their effectiveness in managing class imbalance. The LGBM Classifier, XGB Classifier, and Logistic Regression (LR) are optimized using recursive feature elimination and hyperparameter tuning with Optuna. The study culminates in an ensemble model that significantly enhances predictive accuracy. The final model achieved 80.75% balanced accuracy and 79.97% recall for critical heart attack cases (class 1), along with an AUC of 88.9%, indicating strong discrimination between classes. Additionally, SHAP (SHapley Additive exPlanations) analysis provided insight into the contribution of each feature to heart attack likelihood, improving model transparency. By integrating complex ML techniques with interpretability analyses such as SHAP, this study demonstrates the potential of sophisticated ML approaches for early heart attack detection and prevention and highlights their value in improving outcomes for patients with chronic diseases. These findings suggest promising pathways for employing advanced analytical tools in healthcare to enhance patient care.
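To illustrate why balanced accuracy is preferred over plain accuracy on imbalanced data, the following minimal sketch computes both on toy labels. The labels, the helper names, and the two hypothetical classifiers are assumptions made for illustration and are not taken from the study or from BRFSS. The sketch assumes the usual definition of balanced accuracy as the unweighted mean of per-class recall, which is the same quantity as macro-averaged recall.

```python
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Return {label: recall} for every label that appears in y_true."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / support[label] for label in support}


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recall (equal to macro-averaged recall)."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy labels (invented, not BRFSS data): 95 negatives followed by 5 positives.
y_true = [0] * 95 + [1] * 5

# Hypothetical classifier A: always predicts the majority class.
y_majority = [0] * 100

# Hypothetical classifier B: 76 of 95 negatives correct, 4 of 5 positives correct.
y_model = [0] * 76 + [1] * 19 + [1] * 4 + [0] * 1

print("A accuracy:", accuracy(y_true, y_majority))
print("A balanced accuracy:", balanced_accuracy(y_true, y_majority))
print("B accuracy:", accuracy(y_true, y_model))
print("B balanced accuracy:", balanced_accuracy(y_true, y_model))
```

Classifier A scores high on plain accuracy while detecting no positive cases at all, and balanced accuracy exposes this by dropping to chance level. Classifier B has lower plain accuracy but recovers most positives, which is the behaviour the chosen metrics are meant to reward.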
Keywords