EClinicalMedicine (Feb 2025)

Development and validation of an explainable machine learning model for mortality prediction among patients with infected pancreatic necrosis

  • Caihong Ning,
  • Hui Ouyang,
  • Jie Xiao,
  • Di Wu,
  • Zefang Sun,
  • Baiqi Liu,
  • Dingcheng Shen,
  • Xiaoyue Hong,
  • Chiayan Lin,
  • Jiarong Li,
  • Lu Chen,
  • Shuai Zhu,
  • Xinying Li,
  • Fada Xia,
  • Gengwen Huang

Journal volume & issue
Vol. 80
p. 103074

Abstract

Summary

Background: Infected pancreatic necrosis (IPN) is a severe complication of acute pancreatitis, with reported mortality rates of 15% to 35%. Existing mortality prediction tools for IPN are limited and lack sufficient sensitivity and specificity. This study aimed to develop and validate an explainable machine learning (ML) model for predicting death among patients with IPN.

Methods: We performed a prospective cohort study of 344 patients with IPN consecutively enrolled at a large Chinese tertiary hospital from January 2011 to January 2023. Ten ML models were developed to predict 90-day mortality in these patients. A benchmarking test involving nested resampling, automatic hyperparameter tuning, and random search was conducted to select the ML model. Sequential forward selection was used to choose the optimal feature subset from 31 candidate subsets, simplifying the model while maximizing predictive performance. The final model was internally validated with 1000 bootstrap resamples and externally validated in an independent cohort of 132 patients with IPN retrospectively collected from another Chinese tertiary hospital from January 2018 to January 2023. The SHapley Additive exPlanations (SHAP) method was used to interpret the model in terms of feature importance and feature effects. The final model, built on the optimal feature subset, was deployed as an interactive web-based Shiny app.

Findings: The random survival forest (RSF) model showed the best predictive performance among the 10 ML models (internal validation, C-index = 0.863 [95% CI: 0.854–0.875]; external validation, C-index = 0.857 [95% CI: 0.850–0.865]). Multiple organ failure, Acute Physiology and Chronic Health Evaluation II (APACHE II) score ≥20, duration of organ failure ≥21 days, bloodstream infection, time from onset to first intervention <30 days, Bedside Index of Severity in Acute Pancreatitis score ≥3, critical acute pancreatitis, age ≥50 years, and hemorrhage were the 9 most important features associated with mortality. Furthermore, the SHAP algorithm revealed nonlinear interactive associations between important predictors and mortality, identifying 9 feature pairs with high SHAP interaction values and clinical significance. Two interactive web-based Shiny apps were developed to enhance clinical practicability: https://rsfmodels.shinyapps.io/IPN_app/ for cases where the APACHE II score was available and https://rsfmodels.shinyapps.io/IPNeasy/ for cases where it was not.

Interpretation: An explainable ML model for predicting death among patients with IPN was feasible and effective, suggesting its potential to guide clinical management and improve patient outcomes. Two publicly accessible web tools built on the optimized model facilitate its use in clinical settings.

Funding: The Natural Science Foundation of Hunan Province (2023JJ30885), Postdoctoral Fellowship Program of CPSF (GZB20230872), The Youth Science Foundation of Xiangya Hospital (2023Q13), The Project Program of National Clinical Research Center for Geriatric Disorders of Xiangya Hospital (2021LNJJ19).
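
For readers unfamiliar with the discrimination metric reported above, the following is a minimal sketch of how a C-index and a bootstrap 95% confidence interval can be computed for right-censored survival data. It is not the authors' code: the function names, the toy data, and the choice of a simple percentile bootstrap are assumptions made for illustration, and the study's actual resampling procedure may differ (for example, an optimism-corrected bootstrap).

```python
import random


def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored data (illustrative helper).

    A pair (i, j) is comparable when subject i had an observed event
    strictly before subject j's follow-up time. The pair is concordant
    when subject i also has the higher predicted risk; tied risks
    contribute 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def bootstrap_ci(times, events, risks, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the C-index (an assumed, simple variant)."""
    concordance_index(times, events, risks)  # fail early if the data have no comparable pairs
    rng = random.Random(seed)
    n = len(times)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample subjects with replacement
        try:
            stats.append(concordance_index(
                [times[k] for k in idx],
                [events[k] for k in idx],
                [risks[k] for k in idx],
            ))
        except ValueError:
            continue  # this resample had no comparable pairs; draw another
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical toy data: follow-up in days, event flag (1 = died), model risk score.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.6, 0.4, 0.1]

c_index = concordance_index(toy_times, toy_events, toy_risks)
ci_low, ci_high = bootstrap_ci(toy_times, toy_events, toy_risks, n_boot=200)
```

In the toy data the predicted risks are perfectly ordered against survival time, so every comparable pair is concordant and the C-index is 1.0; a value of 0.5 corresponds to chance-level discrimination. For a random survival forest, the risk score would typically be the ensemble mortality estimate, which is again an assumption about how the published C-index was derived.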

Keywords