Intelligent Medicine (May 2024)
A hybrid system to predict brain stroke using a combined feature selection and classifier
Abstract
Background: Brain stroke is a serious health issue that requires timely and accurate prediction for effective treatment and prevention. This study describes a hybrid system that pairs the best-performing feature selection method with the best-performing classifier to predict brain stroke. Methods: The Stroke Prediction Dataset from Kaggle was used. The synthetic minority over-sampling technique (SMOTE) was applied to balance the classes. Five classifiers, Naïve Bayes (NB), support vector machine (SVM), random forest (RF), adaptive boosting (AdaBoost), and extreme gradient boosting (XGBoost), were compared in combination with three feature selection techniques, mutual information (MI), Pearson correlation (PC), and feature importance (FI), to determine the best combination for predicting brain stroke. Performance was measured by accuracy, sensitivity, specificity, precision, and F-measure, each assessed using k-fold cross-validation. Results: The proposed hybrid system identified a reduced set of features that effectively predicted brain stroke. FI provided a feature reduction ratio of 36.3%. The most successful hybrid system used FI as the feature selection technique and RF as the classifier, achieving an accuracy of 97.17%. Conclusion: The proposed system predicted brain stroke with high accuracy. These findings could inform the early detection and prevention of brain stroke, allowing healthcare professionals to provide timely and targeted care to at-risk patients.
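The five performance measures named in the Methods can all be derived from the four cells of a binary confusion matrix. The following is a minimal sketch using the standard textbook definitions, with stroke treated as the positive class. The names `ConfusionCounts`, `confusion_counts`, and `classification_metrics` are hypothetical helpers introduced here for illustration, and the toy labels are made up; none of this is the study's code or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a binary confusion matrix (hypothetical helper type)."""
    tp: int  # positive cases predicted positive
    fp: int  # negative cases predicted positive
    tn: int  # negative cases predicted negative
    fn: int  # positive cases predicted negative


def confusion_counts(y_true, y_pred, positive=1):
    """Tally the confusion matrix cells from paired true/predicted labels."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(c):
    """Compute accuracy, sensitivity, specificity, precision, and F-measure."""
    sensitivity = _ratio(c.tp, c.tp + c.fn)  # also called recall
    precision = _ratio(c.tp, c.tp + c.fp)
    return {
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "sensitivity": sensitivity,
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "precision": precision,
        # F-measure is the harmonic mean of precision and sensitivity.
        "f_measure": _ratio(2 * precision * sensitivity, precision + sensitivity),
    }


# Toy labels for illustration only (1 = stroke, 0 = no stroke); not study data.
toy_true = [1, 1, 1, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 0, 0, 0, 0, 1, 0]
toy_metrics = classification_metrics(confusion_counts(toy_true, toy_pred))
```

Under k-fold cross-validation, these measures would typically be computed on each held-out fold and then averaged across folds; the sketch covers a single fold only.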