Archives of Medical Science (Jun 2024)
Development and validation of an ensemble learning risk model for sepsis after abdominal surgery
Abstract
Introduction: Although screening for patients at high risk of sepsis after abdominal surgery has gained attention, the clinical application of such screening methods remains limited. Therefore, we aimed to develop and validate machine learning models, based on routine variables, for screening patients at high risk of sepsis after abdominal surgery.

Material and methods: The dataset comprised patients from three representative academic hospitals in China and from the Medical Information Mart for Intensive Care IV (MIMIC-IV) database. Routine clinical variables were used for model development. The Boruta algorithm was applied for feature selection. Afterwards, ensemble learning and eight other conventional algorithms were used for model fitting and validation, based on all features and on the selected features. The area under the receiver operating characteristic curve (ROC AUC), sensitivity, specificity, F1 score, accuracy, net reclassification index (NRI), integrated discrimination improvement (IDI), decision curve analysis (DCA), and calibration curves were used for model evaluation.

Results: A total of 955 patients undergoing abdominal surgery were included in the final analysis (sepsis: 285, non-sepsis: 670). After feature selection, the ensemble learning model constructed by integrating k-Nearest Neighbor (KNN) and Support Vector Machine (SVM) performed best, yielding a ROC AUC of 0.892 (0.841–0.944) and an accuracy of 85.0% on the test data, and a ROC AUC of 0.782 (0.727–0.838) and an accuracy of 68.1% on the validation data. Albumin, ASA score, neutrophil-lymphocyte ratio, age, and glucose were the top features associated with postoperative sepsis according to KNN and SVM.

Conclusions: We developed a new and potentially generalizable model to preoperatively screen patients at high risk of sepsis after abdominal surgery, with the advantages of a representative training cohort and routine variables.
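As an illustration of the threshold-based evaluation metrics named in the methods (sensitivity, specificity, F1 score, accuracy), the following minimal sketch computes them from the four cells of a confusion matrix. The names `ConfusionCounts` and `classification_metrics` and the example counts are hypothetical and chosen only for illustration; they are not taken from the study or its code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a binary confusion matrix (positive class = sepsis)."""
    tp: int  # sepsis patients flagged as high risk
    fp: int  # non-sepsis patients flagged as high risk
    tn: int  # non-sepsis patients flagged as low risk
    fn: int  # sepsis patients flagged as low risk


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def classification_metrics(c: ConfusionCounts) -> dict:
    """Compute sensitivity, specificity, F1 score, and accuracy."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "sensitivity": _ratio(c.tp, c.tp + c.fn),
        "specificity": _ratio(c.tn, c.tn + c.fp),
        # F1 is the harmonic mean of precision and sensitivity,
        # which simplifies to 2*TP / (2*TP + FP + FN).
        "f1": _ratio(2 * c.tp, 2 * c.tp + c.fp + c.fn),
        "accuracy": _ratio(c.tp + c.tn, total),
    }


if __name__ == "__main__":
    # Hypothetical counts, NOT results from the study.
    example = ConfusionCounts(tp=40, fp=10, tn=120, fn=20)
    for name, value in classification_metrics(example).items():
        print(f"{name}: {value:.3f}")
```

Because the cohort is imbalanced (285 sepsis versus 670 non-sepsis patients), accuracy alone can look favorable for a model that misses many sepsis cases, which is one reason to read sensitivity, specificity, and F1 score alongside it.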
Keywords