IEEE Access (Jan 2021)

An Imbalanced-Data Processing Algorithm for the Prediction of Heart Attack in Stroke Patients

  • Meng Wang,
  • Xinghua Yao,
  • Yixiang Chen

DOI
https://doi.org/10.1109/ACCESS.2021.3057693
Journal volume & issue
Vol. 9
pp. 25394 – 25404

Abstract

Predicting heart attack early in stroke patients through data analysis is one way to reduce a high mortality rate. Stroke-patient data from the Intensive Care Unit are imbalanced because stroke patients who have a heart attack are a minority of all stroke patients, which makes predicting heart attack from these data a challenge. To process the imbalanced data, this paper designs an algorithm that combines random undersampling, clustering, and oversampling techniques, called the undersampling-clustering-oversampling algorithm (UCO algorithm for short). The UCO algorithm generates nearly balanced data, which are used to train machine-learning models for predicting heart attack. Extensive experiments on the Medical Information Mart for Intensive Care III (MIMIC-III) database evaluate the UCO algorithm. A setting with an undersampling number of 120, denoted UCO(120), performs well in helping machine-learning classifiers extract features. Five classifiers are separately deployed to predict heart attack from the outputs of UCO(120). Our results show that the random forest classifier achieves the best predictive performance, with an accuracy of 70.29% and a precision of 70.05%. Using UCO(120) with random forest, whether a stroke patient will have a heart attack can be predicted well.
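
The abstract names the three ingredients of the algorithm (random undersampling, clustering, oversampling) but not how they are combined, so the following is only an illustrative sketch of one plausible arrangement, not the authors' UCO algorithm. It assumes that the majority class is randomly undersampled to `n_under` points, that the minority class is grouped with a small k-means, and that synthetic minority points are created by interpolating between members of the same cluster. The names `kmeans` and `uco_sketch`, the parameter `k`, and the interpolation step are all hypothetical choices made for this example.

```python
import random


def kmeans(points, k, iters=20, rng=None):
    """Tiny k-means on tuples of floats; returns the non-empty clusters."""
    rng = rng or random.Random(0)
    centers = rng.sample(points, k)
    clusters = []
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each point to its nearest center (squared Euclidean distance).
            nearest = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
            clusters[nearest].append(p)
        # Recompute centers; an empty cluster keeps its previous center.
        centers = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
    return [cl for cl in clusters if cl]


def uco_sketch(majority, minority, n_under=120, k=3, seed=0):
    """Hypothetical undersample -> cluster -> oversample pipeline (assumed design)."""
    rng = random.Random(seed)

    # Step 1 (assumption): randomly undersample the majority class.
    under = rng.sample(majority, min(n_under, len(majority)))

    # Step 2 (assumption): cluster the minority class.
    clusters = kmeans(minority, min(k, len(minority)), rng=rng)

    # Step 3 (assumption): oversample by interpolating between two points
    # drawn from the same cluster until both classes have equal size.
    needed = max(0, len(under) - len(minority))
    synthetic = []
    while len(synthetic) < needed:
        cluster = rng.choice(clusters)
        a, b = rng.choice(cluster), rng.choice(cluster)
        t = rng.random()
        synthetic.append(tuple(x + t * (y - x) for x, y in zip(a, b)))

    features = under + minority + synthetic
    labels = [0] * len(under) + [1] * (len(minority) + len(synthetic))
    return features, labels
```

With `n_under=120`, mirroring the UCO(120) setting, this sketch returns 120 majority points and, when the minority class has fewer than 120 points, 120 minority points (real plus synthetic). The resulting data could then be passed to any classifier, such as a random forest.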

Keywords