Therapeutics and Clinical Risk Management (Sep 2021)
Optimized Machine Learning Models to Predict In-Hospital Mortality for Patients with ST-Segment Elevation Myocardial Infarction
Abstract
Jia Zhao,1,2 Pengyu Zhao,3 Chunjie Li,2 Yonghong Hou3
1Graduate School, Tianjin Medical University, Tianjin, 300070, People’s Republic of China; 2Department of Cardiology, Tianjin Chest Hospital, Tianjin, 300222, People’s Republic of China; 3School of Electrical and Information Engineering, Tianjin University, Tianjin, 300072, People’s Republic of China
Correspondence: Chunjie Li; Yonghong Hou. Tel +86(022)88185135. Fax +86(022)88185338. Email [email protected]; [email protected]
Purpose: This study aimed to optimize machine learning (ML) models for predicting in-hospital mortality in patients with ST-segment elevation acute myocardial infarction (STEMI).
Patients and Methods: A total of 5708 STEMI patients were enrolled and divided into two groups according to their in-hospital outcomes. Each group was randomly split into a training set (75%) and a testing set (25%). Four ML models were trained on data to which random under-sampling (RUS) had been applied. The performance of the optimized ML models was evaluated in terms of accuracy, sensitivity, specificity, G-mean and AUC. Two feature sets, defined chronologically, were considered: a full set that included all variables recorded during hospitalization and a simplified set that included only variables available prior to reperfusion therapy. The performance of the prediction models trained with these two feature sets was compared.
Results: For the comprehensive metric, G-mean, the models trained with RUS outperformed those trained without it (80.54% vs 23.31% on average with the full set and 75.72% vs 35.76% on average with the simplified set). Among models trained with the full set, the SVM achieved the best performance, with 85.62% accuracy, 84.21% sensitivity, 85.66% specificity, 84.93% G-mean and 0.919 AUC. Among models trained with the simplified set, the SVM achieved 83.48% G-mean, which was comparable to the models trained using the full set. For the most critical metric, sensitivity, the SVM trained using the simplified set achieved 89.47%, which even exceeded the values for the SVM (84.21%), DT (81.58%) and RF (81.58%) trained using the full set.
Conclusion: Applying RUS can improve the performance of prediction models, and the models trained with the simplified set, which included only variables prior to reperfusion therapy, can accurately identify high-risk patients.
Keywords: STEMI, in-hospital mortality, prediction model, optimized machine learning algorithm, random under-sampling
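
The two ideas the abstract relies on, random under-sampling and the G-mean, can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that deaths are coded as 1 and survivors as 0, that the majority class is reduced to a 1:1 ratio (the abstract does not state the ratio used), and that G-mean is the usual geometric mean of sensitivity and specificity. The function names `random_under_sample`, `sensitivity_specificity` and `g_mean` are hypothetical.

```python
import math
import random


def random_under_sample(rows, labels, seed=0):
    """Randomly drop majority-class rows until both classes have equal size.

    Assumption: a 1:1 target ratio; the abstract does not give the ratio.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    kept = minority + rng.sample(majority, len(minority))
    rng.shuffle(kept)
    return [rows[i] for i in kept], [labels[i] for i in kept]


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = death, assumed)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sens = tp / (tp + fn) if (tp + fn) else 0.0
    spec = tn / (tn + fp) if (tn + fp) else 0.0
    return sens, spec


def g_mean(sensitivity, specificity):
    """Geometric mean of sensitivity and specificity."""
    return math.sqrt(sensitivity * specificity)


# Toy imbalanced training set: 5 deaths among 100 patients (made-up numbers).
labels = [1] * 5 + [0] * 95
rows = [[i] for i in range(100)]
X_bal, y_bal = random_under_sample(rows, labels)
print(sum(y_bal), len(y_bal))  # 5 10

# Consistency check against the values reported for the full-set SVM.
print(round(100 * g_mean(0.8421, 0.8566), 2))  # 84.93

# A model that predicts "survived" for everyone has high accuracy but zero G-mean.
print(g_mean(*sensitivity_specificity(labels, [0] * 100)))  # 0.0
```

In this sketch under-sampling is applied to training data only, so that the testing set keeps the original class balance. The last line shows why G-mean is a useful summary for imbalanced outcomes: a classifier that never predicts death scores 95% accuracy on the toy data, yet its G-mean is 0.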