Applied Sciences (Jul 2022)

Mixed Machine Learning Approach for Efficient Prediction of Human Heart Disease by Identifying the Numerical and Categorical Features

  • Ghulab Nabi Ahmad,
  • Shafiullah,
  • Hira Fatima,
  • Mohamed Abbas,
  • Obaidur Rahman,
  • Imdadullah,
  • Mohammed S. Alqahtani

DOI
https://doi.org/10.3390/app12157449
Journal volume & issue
Vol. 12, no. 15
p. 7449

Abstract

Heart disease endangers public health because of its prevalence and high mortality risk. Predicting cardiac disease early from a few simple physical indicators collected during a routine physical examination remains difficult. Clinically, recognizing the signs of heart disease is critical for accurate prediction and for concrete steps toward later diagnosis. Manual analysis and prediction over a massive volume of data are challenging and time-consuming. This paper proposes a new heart disease prediction model that aims to predict heart disease correctly and rapidly from a variety of bodily signs. The prediction algorithm is based on analyzing the classification performance of predictive models on combined datasets using the train-test split technique. The training results of the proposed technique are then compared with previous works. For the Cleveland, Switzerland, Hungarian, and Long Beach VA heart disease datasets, accuracy, precision, recall, F1-score, and ROC-AUC are used as the performance indicators. The reported results for the Random Forest Classifier (RFC) on the combined heart disease datasets are F1-score 100%, accuracy 100%, precision 100%, recall 100%, and ROC-AUC 100%. For the Decision Tree Classifier on the pooled heart disease datasets, the reported results are F1-score 100%, accuracy 98.80%, precision 98%, recall 99%, and ROC-AUC 99%; for the RFC and the Gradient Boosting Classifier (GBC), the ROC-AUC is 100%. Five-fold cross-validation is used to improve the performance of the machine learning algorithms. The Stacking CV Classifier is also used to improve on the individual machine learning algorithms by combining two or three techniques. Several reduction methods are also incorporated. The RFC is found to achieve high classification accuracy, and the developed method is found to be efficient and reliable for predicting heart disease.
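As a rough illustration of the evaluation terms used in the abstract, the sketch below computes accuracy, precision, recall, and F1-score for binary labels and builds five-fold train/test index splits. It is a minimal, standard-library sketch and not the authors' implementation: the function names `binary_metrics` and `k_fold_indices`, the unshuffled contiguous folds, and the toy label lists are all assumptions made for illustration.

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, precision, recall, and F1 for 0/1 labels (1 = disease)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, test_idx) pairs for k contiguous, unshuffled folds."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        test_idx = list(range(start, stop))
        train_idx = list(range(0, start)) + list(range(stop, n_samples))
        yield train_idx, test_idx
        start = stop


# Hypothetical toy labels, not data from the paper.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)

# Five folds over ten hypothetical samples.
folds = list(k_fold_indices(10, k=5))
```

In practice the rows would usually be shuffled (and often stratified by class) before splitting; the contiguous folds here only keep the sketch short.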

Keywords