BMC Medical Informatics and Decision Making (Sep 2023)

A generalizable and interpretable model for mortality risk stratification of sepsis patients in intensive care unit

  • Jinhu Zhuang,
  • Haofan Huang,
  • Song Jiang,
  • Jianwen Liang,
  • Yong Liu,
  • Xiaxia Yu

DOI
https://doi.org/10.1186/s12911-023-02279-0
Journal volume & issue
Vol. 23, no. 1
pp. 1 – 16

Abstract

Purpose: This study aimed to construct a mortality model for the risk stratification of intensive care unit (ICU) patients with sepsis by applying a machine learning algorithm.

Methods: Adult patients who were diagnosed with sepsis during admission to the ICU were extracted from the MIMIC-III, MIMIC-IV, eICU, and Zigong databases. MIMIC-III was used for model development and internal validation. The other three databases were used for external validation. The proposed model was developed based on the Extreme Gradient Boosting (XGBoost) algorithm. The generalizability, discrimination, and validation of the model were evaluated. Shapley Additive Explanation (SHAP) values were used to interpret the model and analyze the contribution of individual features.

Results: A total of 16,741, 15,532, 22,617, and 1,198 sepsis patients were extracted from the MIMIC-III, MIMIC-IV, eICU, and Zigong databases, respectively. The proposed model had an area under the receiver operating characteristic curve (AUROC) of 0.84 in the internal validation, which outperformed all the traditional scoring systems. In the external validations, the AUROC was 0.87 in the MIMIC-IV database, better than all the traditional scoring systems. In the eICU database, the AUROC was 0.83, higher than those of the Simplified Acute Physiology Score III and the Sequential Organ Failure Assessment (SOFA) and equal to the 0.83 of the Acute Physiology and Chronic Health Evaluation IV (APACHE-IV). In the Zigong database, the AUROC was 0.68, higher than those of the systemic inflammatory response syndrome and SOFA. Furthermore, the proposed model showed the best discrimination and calibration and had the best net benefit in each validation.

Conclusions: The proposed algorithm based on XGBoost and SHAP-value feature selection had high performance in predicting the mortality of sepsis patients within 24 h of ICU admission.
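For readers less familiar with the headline metric, the following is a minimal sketch of how an AUROC can be computed from predicted risks and observed outcomes. It is not the authors' code: the function name `auroc` and the toy data are assumptions for illustration, and the calculation uses the standard rank-based (Mann-Whitney U) formulation rather than any particular library.

```python
def auroc(labels, scores):
    """Rank-based AUROC: the probability that a randomly chosen positive
    case (label 1, e.g. death) receives a higher score than a randomly
    chosen negative case (label 0), with ties counted as one half."""
    pairs = sorted(zip(scores, labels))  # ascending by score
    n = len(pairs)

    # Assign 1-based ranks, giving tied scores their average rank.
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative case")

    # Mann-Whitney U statistic for the positive class, normalized to [0, 1].
    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u_stat / (n_pos * n_neg)


# Hypothetical toy data: 1 = died, 0 = survived; scores are predicted risks.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80]
toy_auroc = auroc(toy_labels, toy_scores)
```

A value of 0.5 corresponds to a model that ranks patients no better than chance, and 1.0 to a model that ranks every non-survivor above every survivor; the values reported above (0.68 to 0.87) fall between these extremes.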

Keywords