Journal of International Medical Research (Nov 2024)

Can machine learning models improve the prediction of surgical site infection in abdominal surgery than traditional statistical models?

  • Pongsathorn Piebpien,
  • Amarit Tansawet,
  • Oraluck Pattanaprateep,
  • Anuchate Pattanateepapon,
  • Chumpon Wilasrusmee,
  • Gareth J. Mckay,
  • John Attia,
  • Ammarin Thakkinstian

DOI
https://doi.org/10.1177/03000605241293696
Journal volume & issue
Vol. 52

Abstract

Objective: To externally validate, revise and update the Study on the Efficacy of Nosocomial Infection Control (SENIC) model of surgical site infection (SSI) using logistic regression (LR) and machine learning (ML) approaches.

Methods: A retrospective analysis was performed on hospital database-derived data from patients who had undergone gastrointestinal, colorectal and hernia surgeries (identified by ICD-9-CM codes). The SENIC index was calculated and fitted in an LR model. ML models were developed using decision tree (DT), random forest (RF), extreme gradient boosting (XGBoost) and Naïve Bayes (NB) algorithms.

Results: The prevalence of SSI was 3.21% (404 of 12 596 surgeries; 95% confidence interval [CI] 2.91%, 3.53%). The C-statistic for the original SENIC model was 0.668 (95% CI 0.648, 0.688), with an observed/expected (O/E) ratio of 0.998 (interquartile range [IQR] 0.750, 1.047). An updated-SENIC-LR model with six predictors had a C-statistic of 0.768 (95% CI 0.745, 0.790) and an O/E ratio of 0.999 (IQR 0.976, 1.004). The ML models, which considered 14 predictors, performed worse than the updated-SENIC-LR model, with C-statistics of 0.679, 0.675, 0.656 and 0.651 for NB, XGBoost, RF and DT, respectively. Overfitting was detected for the ML approaches, particularly DT, RF and XGBoost.

Conclusion: The updated-SENIC-LR model and NB may be useful for monitoring SSI risk following abdominal surgery.
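The quantities reported in the abstract (prevalence with its confidence interval, the C-statistic and the O/E ratio) can be illustrated with a minimal Python sketch. The abstract does not state how the prevalence interval was computed, so the Wilson score method used here is an assumption. The function names and the small toy dataset are hypothetical and are not taken from the study.

```python
import math


def wilson_interval(events, n, z=1.96):
    """Wilson score interval for a proportion (assumed method, not stated in the abstract)."""
    p = events / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


def c_statistic(y_true, y_prob):
    """Discrimination: chance that a random case is scored above a random non-case.

    Ties between a case and a non-case count as 0.5.
    """
    cases = [p for y, p in zip(y_true, y_prob) if y == 1]
    controls = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum((a > b) + 0.5 * (a == b) for a in cases for b in controls)
    return wins / (len(cases) * len(controls))


def oe_ratio(y_true, y_prob):
    """Calibration-in-the-large: observed events over the sum of predicted risks."""
    return sum(y_true) / sum(y_prob)


# Counts reported in the abstract: 404 SSIs among 12 596 surgeries.
prevalence = 404 / 12596
ci_low, ci_high = wilson_interval(404, 12596)
print(f"prevalence {prevalence:.2%}, 95% CI {ci_low:.2%} to {ci_high:.2%}")

# Hypothetical toy data (1 = SSI, 0 = no SSI) with made-up predicted risks.
toy_outcomes = [0, 0, 1, 0, 1, 0, 0, 1]
toy_risks = [0.1, 0.2, 0.6, 0.3, 0.2, 0.1, 0.4, 0.7]
toy_c = c_statistic(toy_outcomes, toy_risks)
toy_oe = oe_ratio(toy_outcomes, toy_risks)
print(f"toy C-statistic {toy_c:.3f}, toy O/E ratio {toy_oe:.3f}")
```

With the reported counts, the Wilson interval rounds to the 2.91% and 3.53% bounds given in the Results, although this does not confirm which method the authors used. A C-statistic of 0.5 indicates no discrimination and 1.0 perfect discrimination, while an O/E ratio close to 1 indicates that predicted and observed event counts agree overall.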