Frontiers in Oncology (Jun 2024)

Construction of a predictive model for postoperative hospitalization time in colorectal cancer patients based on interpretable machine learning algorithm: a prospective preliminary study

  • Zhongjian Wen,
  • Yiren Wang,
  • Shouying Chen,
  • Yunfei Li,
  • Hairui Deng,
  • Haowen Pang,
  • Shengmin Guo,
  • Ping Zhou,
  • Shiqin Zhu

DOI
https://doi.org/10.3389/fonc.2024.1384931
Journal volume & issue
Vol. 14

Abstract

Objective: This study aims to construct a predictive model based on machine learning algorithms to assess the risk of prolonged postoperative hospital stay in colorectal cancer patients and to analyze the preoperative and postoperative factors associated with extended hospitalization.

Methods: We prospectively collected clinical data from 83 colorectal cancer patients. The study included 40 variables (39 predictor variables and 1 target variable). Important variables were selected with the Lasso regression algorithm, and predictive models were constructed using ten machine learning algorithms: Logistic Regression (LR), Decision Tree, Random Forest, Support Vector Machine, Light Gradient Boosting Machine, K-Nearest Neighbors (KNN), Extreme Gradient Boosting, Categorical Boosting, Artificial Neural Network, and Deep Forest. Model performance was evaluated using Bootstrap ROC curves and calibration curves; the optimal model was then selected and interpreted using the SHAP explainability algorithm.

Results: Ten important, significantly correlated variables were identified through Lasso regression; validation used 1000 Bootstrap resamplings, with results presented as Bootstrap ROC curves. The Logistic Regression model achieved the highest AUC (AUC=0.99, 95% CI=0.97–0.99). The SHAP analysis showed that the distance walked on the third day after surgery was the most important variable for the LR model.

Conclusion: This study constructed a model that predicts postoperative hospital stay duration from patients' clinical data. The model promises to give healthcare professionals a more precise prediction tool in clinical practice and a basis for personalized nursing interventions, thereby improving patient prognosis and quality of life and enhancing the efficiency of medical resource utilization.
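The abstract reports an AUC with a 95% confidence interval obtained from 1000 Bootstrap resamplings. As a rough illustration of how such an interval can be computed, the sketch below uses a percentile bootstrap over resampled patients. This is an assumption for illustration only: the abstract does not state which bootstrap variant the authors used, the function names `auc` and `bootstrap_auc_ci` are hypothetical, and the labels and scores are made-up toy values, not study data.

```python
import random


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Point AUC plus a percentile bootstrap interval (assumed variant)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        # Resample cases with replacement, keeping label and score paired.
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain only one class
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return auc(labels, scores), lower, upper


# Toy example: 1 = prolonged stay, 0 = normal stay (illustrative values only).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
point, lower, upper = bootstrap_auc_ci(toy_labels, toy_scores)
print(f"AUC={point:.2f}, 95% CI={lower:.2f}-{upper:.2f}")
```

With a cohort as small as the 83 patients described, an interval of this kind can be sensitive to the resampling scheme, so the exact bounds depend on choices that the abstract does not specify.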

Keywords