Journal of Clinical Medicine (Jan 2023)

Developing an Interpretable Machine Learning Model to Predict in-Hospital Mortality in Sepsis Patients: A Retrospective Temporal Validation Study

  • Shuhe Li,
  • Ruoxu Dou,
  • Xiaodong Song,
  • Ka Yin Lui,
  • Jinghong Xu,
  • Zilu Guo,
  • Xiaoguang Hu,
  • Xiangdong Guan,
  • Changjie Cai

DOI
https://doi.org/10.3390/jcm12030915
Journal volume & issue
Vol. 12, no. 3
p. 915

Abstract

Background: Risk stratification plays an essential role in decision making for sepsis management, yet existing approaches struggle to assess this heterogeneous population. We aimed to develop and validate a machine learning model to predict in-hospital mortality in critically ill patients with sepsis. Methods: Adult patients meeting the Sepsis-3 definition at a large tertiary medical center were included. Relevant clinical features were extracted from the first 24 h in the ICU, regrouped into categories, and used for model development under three strategies: “Basic + Lab”, “Basic + Intervention”, and “Whole” feature sets. Extreme gradient boosting (XGBoost) was compared with logistic regression (LR) and established severity scores. Temporal validation was conducted using admissions from 2017 to 2019. Results: The final cohort included 24,272 patients, of whom 4013 formed the test cohort for temporal validation. The trained and fine-tuned XGBoost model with the whole feature set showed the best discriminatory ability in the test cohort, with an AUROC of 0.85, significantly higher than the XGBoost “Basic + Lab” model (0.83), the LR “Whole” model (0.82), SOFA (0.63), SAPS-II (0.73), and the LODS score (0.74). Performance remained robust across subgroups, and when interpretability was explored, predictors such as increased urine output and supplemental oxygen therapy were strongly associated with improved survival. Conclusions: We developed and validated a novel XGBoost-based model that showed significantly better performance than LR and established severity scores in predicting the in-hospital mortality risk of sepsis patients using features from the first 24 h.
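The evaluation design summarized in the abstract, holding out later admissions and scoring discrimination by AUROC, can be illustrated with a minimal sketch. This is not the authors' pipeline: the records, the risk scores, and the function names `temporal_split` and `auroc` are hypothetical, the training years are invented for illustration, and the scores stand in for the output of any fitted model (XGBoost, LR, or a severity score). Only the 2017 cutoff is taken from the abstract.

```python
from typing import List, Sequence, Tuple

# Hypothetical record layout: (admission_year, died_in_hospital, predicted_risk)
Record = Tuple[int, int, float]


def temporal_split(records: Sequence[Record], cutoff_year: int) -> Tuple[List[Record], List[Record]]:
    """Split by admission year: earlier admissions train, later ones test."""
    train = [r for r in records if r[0] < cutoff_year]
    test = [r for r in records if r[0] >= cutoff_year]
    return train, test


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data only; these values are not from the study.
records: List[Record] = [
    (2012, 0, 0.10),
    (2014, 1, 0.80),
    (2015, 0, 0.30),
    (2016, 1, 0.60),
    (2017, 0, 0.20),
    (2017, 1, 0.90),
    (2018, 0, 0.70),
    (2018, 1, 0.65),
    (2019, 0, 0.40),
    (2019, 1, 0.85),
]

# Admissions from 2017 onward form the temporal test cohort.
train, test = temporal_split(records, cutoff_year=2017)
test_auroc = auroc([r[1] for r in test], [r[2] for r in test])
print(f"train n={len(train)}, test n={len(test)}, test AUROC={test_auroc:.3f}")
```

Splitting by time rather than at random keeps every test admission later than every training admission, which is what distinguishes temporal validation from an ordinary random hold-out. The pairwise AUROC above is quadratic in cohort size and is meant only to make the definition concrete; a rank-based implementation would be preferable for a cohort of thousands.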

Keywords