Chinese Medical Journal (Mar 2020)

Prediction of fatal adverse prognosis in patients with fever-related diseases based on machine learning: A retrospective study

  • Chun-Hong Zhao,
  • Hui-Tao Wu,
  • He-Bin Che,
  • Ya-Nan Song,
  • Yu-Zhuo Zhao,
  • Kai-Yuan Li,
  • Hong-Ju Xiao,
  • Yong-Zhi Zhai,
  • Xin Liu,
  • Hong-Xi Lu,
  • Tan-Shi Li,
  • Yuan-Yuan Ji

DOI
https://doi.org/10.1097/CM9.0000000000000675
Journal volume & issue
Vol. 133, no. 5
pp. 583 – 589

Abstract

Background. Fever is the most common chief complaint of emergency patients. Early identification of patients at increased risk of death may avert adverse outcomes. The aim of this study was to establish an early prediction model of fatal adverse prognosis in fever patients by extracting key indicators using big data technology.

Methods. A retrospective study of patients' data was conducted using the Emergency Rescue Database of Chinese People's Liberation Army General Hospital. Patients were divided into a fatal adverse prognosis group and a good prognosis group, and commonly used clinical indicators were compared between them. The recursive feature elimination (RFE) method was used to determine the optimal number of variables to include. Logistic regression, random forest, AdaBoost, and bagging were selected as the models to train. We also collected emergency room data from December 2018 to December 2019 using the same inclusion and exclusion criteria. Model performance was evaluated by accuracy, F1-score, precision, sensitivity, and the area under the receiver operating characteristic curve (ROC-AUC).

Results. For logistic regression, decision tree, AdaBoost, and bagging, respectively, the accuracy was 0.951, 0.928, 0.924, and 0.924; the F1-scores were 0.938, 0.933, 0.930, and 0.930; the precision was 0.943, 0.938, 0.937, and 0.937; and the ROC-AUC values were 0.808, 0.738, 0.736, and 0.885. In ten-fold cross-validation, the ROC-AUC values of the logistic regression and bagging models were 0.80 and 0.87, respectively. The top six variables in the logistic regression model, with their coefficients and odds ratios (OR), were cardiac troponin T (CTnT) (coefficient = 0.346, OR = 1.413), temperature (T) (coefficient = 0.235, OR = 1.265), respiratory rate (RR) (coefficient = -0.206, OR = 0.814), serum potassium (K) (coefficient = 0.137, OR = 1.146), pulse oxygen saturation (SpO2) (coefficient = -0.101, OR = 0.904), and albumin (ALB) (coefficient = -0.043, OR = 0.958). The top six variables by weight in the bagging model were CTnT, RR, lactate dehydrogenase, serum amylase, heart rate, and systolic blood pressure.

Conclusions. The main clinical indicators of concern were CTnT, RR, SpO2, T, ALB, and K. The bagging and logistic regression models had the best overall diagnostic performance. These models may help physicians identify critically ill patients with fever at an early stage.
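
The odds ratios reported for the logistic regression model follow from the coefficients by exponentiation, OR = exp(coefficient). The short Python sketch below recomputes them from the coefficients quoted in the abstract. The dictionary layout and the helper name `odds_ratio` are illustrative assumptions, not part of the study's code. The abstract does not state whether predictors were standardized, so the unit each coefficient refers to is left unspecified here.

```python
import math

# Coefficients and odds ratios exactly as quoted in the abstract.
# Keys are the abbreviations used there; the structure is illustrative.
reported = {
    "CTnT": (0.346, 1.413),
    "T": (0.235, 1.265),
    "RR": (-0.206, 0.814),
    "K": (0.137, 1.146),
    "SpO2": (-0.101, 0.904),
    "ALB": (-0.043, 0.958),
}


def odds_ratio(coefficient: float) -> float:
    """Convert a logistic regression coefficient to an odds ratio."""
    return math.exp(coefficient)


for name, (coef, reported_or) in reported.items():
    recomputed = odds_ratio(coef)
    print(f"{name}: coefficient={coef:+.3f}, "
          f"recomputed OR={recomputed:.3f}, reported OR={reported_or:.3f}")
```

A positive coefficient gives an OR above 1 (higher odds of the fatal adverse prognosis outcome as the variable increases), and a negative coefficient gives an OR below 1.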