Infectious Diseases & Immunity (Apr 2023)

Machine Learning-Based Scoring System for Early Prognosis Evaluation of Patients with Coronavirus Disease 2019

  • Hao-Min Zhang,
  • Lei Shi,
  • Hao-Ran Chen,
  • Jun-Dong Zhang,
  • Ge-Liang Liu,
  • Zi-Ning Wang,
  • Peng Zhi,
  • Run-Sheng Wang,
  • Zhuo-Yang Li,
  • Xi-Meng Chen,
  • Fu-Sheng Wang,
  • Xue-Chun Lu,
  • Haijuan Wang

DOI
https://doi.org/10.1097/ID9.0000000000000077
Journal volume & issue
Vol. 3, no. 2
pp. 83 – 89

Abstract

Background. The global spread of coronavirus disease 2019 (COVID-19) continues to threaten human health security and places considerable pressure on healthcare systems worldwide. Prognostic models exist for hospitalized and intensive care patients with COVID-19, but models developed on large cohorts of thousands of individuals are still lacking.

Methods. Between February 4 and April 16, 2020, we enrolled 3,974 patients admitted with COVID-19 to the Wuhan Huo-Shen-Shan Hospital and the Maternal and Child Hospital, Hubei Province, China. (1) Screening of key prognostic factors: A univariate Cox regression analysis was performed on the 2,649 patients in the training set to screen initially for factors affecting prognosis. A random survival forest model was then built by machine learning to further screen for the factors most important for prognosis. Finally, multivariate Cox regression analysis was used to determine the synergy among the prognosis-related factors. (2) Establishment of a scoring system: A nomogram based on the nine selected key prognostic factors was used to establish a death-risk scoring system for patients with COVID-19; the C-index was calculated, and calibration and survival curves were drawn for the training set. (3) Verification of the scoring system: The scoring system was applied to the 1,325 patients in the test set, who were split into high- and low-risk groups; the C-index was calculated, and calibration and survival curves were drawn.

Results. This cross-sectional study found that age, clinical classification, sex, pulmonary insufficiency, hypoproteinemia, and four other factors (underlying diseases: blood diseases, malignant tumor; complications: digestive tract bleeding, heart dysfunction) were important for the prognosis of the enrolled patients with COVID-19. Herein, we report the discovery of the effects of hypoproteinemia and hematological diseases on the prognosis of COVID-19. The scoring system established here provides an objective score for the early prognosis of patients with COVID-19 and divides them into high- and low-risk groups (scoring threshold 117.77; a score below this value is considered low risk). The system performed better than clinical classification under the current COVID-19 guidelines (C-index, 0.95 vs. 0.89).

Conclusions. Age, clinical classification, sex, pulmonary insufficiency, hypoproteinemia, and four other factors were important for COVID-19 survival. Compared with general statistical methods, this method can quickly and accurately screen out the relevant factors affecting prognosis, provide an order of importance, and establish a scoring system based on the nomogram model, which is of great clinical significance.
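
The risk grouping and the C-index reported in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: only the threshold of 117.77 comes from the abstract. The function names, the treatment of a score exactly equal to the threshold as high risk, the skipping of tied follow-up times, and the five-patient data set are assumptions made purely for illustration.

```python
from itertools import combinations

# Cutoff reported in the abstract; scores below it are considered low risk.
RISK_THRESHOLD = 117.77


def risk_group(score, threshold=RISK_THRESHOLD):
    """Assign a nomogram score to a risk group (hypothetical helper).

    Assumption: a score exactly equal to the threshold counts as high risk.
    """
    return "low" if score < threshold else "high"


def concordance_index(times, events, scores):
    """Harrell's C-index for right-censored survival data.

    times  : follow-up time per patient
    events : 1 if death was observed, 0 if the patient was censored
    scores : risk score per patient (higher = worse predicted prognosis)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that patient a has the shorter follow-up.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        # A pair is comparable only if the shorter follow-up ended in death.
        # Simplification: pairs with tied follow-up times are skipped.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if scores[a] > scores[b]:
            concordant += 1.0   # earlier death had the higher score
        elif scores[a] == scores[b]:
            concordant += 0.5   # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data, not taken from the study.
times = [5, 10, 12, 20, 30]
events = [1, 1, 0, 1, 0]
scores = [150.0, 130.0, 90.0, 140.0, 80.0]

print([risk_group(s) for s in scores])  # ['high', 'high', 'low', 'high', 'low']
print(concordance_index(times, events, scores))  # 0.875
```

In the toy data, eight patient pairs are comparable and seven are ranked correctly, giving 7/8 = 0.875. A C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported values of 0.95 and 0.89 are compared.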