PLoS ONE (Jan 2024)

Systematic evaluation of machine learning models for postoperative surgical site infection prediction.

  • Anna M van Boekel,
  • Siri L van der Meijden,
  • Sesmu M Arbous,
  • Rob G H H Nelissen,
  • Karin E Veldkamp,
  • Emma B Nieswaag,
  • Kim F T Jochems,
  • Jeroen Holtz,
  • Annekee van IJlzinga Veenstra,
  • Jeroen Reijman,
  • Ype de Jong,
  • Harry van Goor,
  • Maryse A Wiewel,
  • Jan W Schoones,
  • Bart F Geerts,
  • Mark G J de Boer

DOI
https://doi.org/10.1371/journal.pone.0312968
Journal volume & issue
Vol. 19, no. 12
p. e0312968

Abstract

Background: Surgical site infections (SSIs) lead to increased mortality and morbidity, as well as increased healthcare costs. Multiple models for the prediction of this serious surgical complication have been developed, with an increasing use of machine learning (ML) tools.

Objective: The aim of this systematic review was to assess the performance as well as the methodological quality of validated ML models for the prediction of SSIs.

Methods: A systematic search in PubMed, Embase and the Cochrane library was performed from inception until July 2023. Exclusion criteria were the absence of reported model validation, SSIs as part of a composite adverse outcome, and pediatric populations. ML performance measures were evaluated, and ML performances were compared to regression-based methods for studies that reported both methods. Risk of bias (ROB) of the studies was assessed using the Prediction model Risk of Bias Assessment Tool.

Results: Of the 4,377 studies screened, 24 were included in this review, describing 85 ML models. Most models were only internally validated (81%). The C-statistic was the most used performance measure (reported in 96% of the studies) and only two studies reported calibration metrics. A total of 116 different predictors were described, of which age, steroid use, sex, diabetes, and smoking were most frequently (100% to 75%) incorporated. Thirteen studies compared ML models to regression-based models and showed a similar performance of both modelling methods. For all included studies, the overall ROB was high or unclear.

Conclusions: A multitude of ML models for the prediction of SSIs are available, with large variability in performance. However, most models lacked external validation, performance was reported limitedly, and the risk of bias was high. In studies describing both ML models and regression-based models, one modelling method did not outperform the other.