Discover Oncology (Sep 2024)

Utilizing machine learning algorithms for predicting risk factors for bone metastasis from right-sided colon carcinoma after complete mesocolic excision: a 10-year retrospective multicenter study

  • Yuan Liu,
  • Yuankun Liu,
  • Shuting Wang,
  • Sen Niu,
  • Langyu Wang,
  • Jiaheng Xie,
  • Ning Zhao,
  • Songyun Zhao,
  • Chao Cheng,
  • Teng Dai

DOI
https://doi.org/10.1007/s12672-024-01327-z
Journal volume & issue
Vol. 15, no. 1
pp. 1 – 16

Abstract

Background: Bone metastasis (BM) occurs when colon cancer cells disseminate from the primary tumor site to the skeletal system via the bloodstream or lymphatic system. The emergence of bone metastases typically heralds a poor prognosis for the patient. The primary aim of this study was to develop a machine learning model that identifies patients at elevated risk of bone metastasis among those with right-sided colon cancer undergoing complete mesocolic excision (CME).

Patients and methods: The study cohort comprised 1,151 patients diagnosed with right-sided colon cancer, 73 of whom presented with bone metastases originating from the colon. Univariate and multivariate regression analyses, together with four machine learning algorithms, were used to screen 38 candidate variables, including patient demographic characteristics and surgical information. The four algorithms, namely extreme gradient boosting (XGBoost), random forest (RF), support vector machine (SVM), and k-nearest neighbor (KNN), were then used to develop the predictive models. The models were assessed using receiver operating characteristic (ROC) curves, calibration curves, and decision curve analysis (DCA), and Shapley additive explanation (SHAP) was used to visualize and interpret the model.

Results: The XGBoost algorithm achieved the best performance among the four prediction models. In the training set, XGBoost had an area under the curve (AUC) of 0.973 (0.953–0.994), an accuracy of 0.925 (0.913–0.936), a sensitivity of 0.921 (0.902–0.940), and a specificity of 0.908 (0.894–0.922). In the validation set, XGBoost had an AUC of 0.922 (0.833–0.995), an accuracy of 0.908 (0.889–0.926), a sensitivity of 0.924 (0.873–0.975), and a specificity of 0.883 (0.810–0.956). Furthermore, an AUC of 0.83 in the external validation set suggests that the XGBoost prediction model generalizes well to external data. SHAP analysis identified alkaline phosphatase (ALP) levels, tumor size, invasion depth, lymph node metastasis, lung metastasis, and postoperative neutrophil-to-lymphocyte ratio (NLR) levels as significant risk factors for BM from right-sided colon cancer after CME.

Conclusion: The prediction model for BM from right-sided colon cancer developed in this study using the XGBoost machine learning algorithm is both highly precise and clinically valuable.
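
For readers unfamiliar with the reported metrics, the following is a minimal sketch of how AUC, accuracy, sensitivity, and specificity can be computed from predicted risk scores. It is not the authors' code: the abstract does not describe the software used, the function names `roc_auc` and `threshold_metrics` are hypothetical, the 0.5 cutoff is an assumption, and the labels and scores are made-up toy values, not study data.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def threshold_metrics(labels, scores, threshold=0.5):
    """Accuracy, sensitivity, and specificity at a fixed score cutoff.
    Assumes both classes are present in `labels`."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        pred = 1 if s >= threshold else 0
        if y == 1 and pred == 1:
            tp += 1
        elif y == 1 and pred == 0:
            fn += 1
        elif y == 0 and pred == 1:
            fp += 1
        else:
            tn += 1
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Toy example (illustrative only): 1 = bone metastasis, 0 = none.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1, 0.05]

auc = roc_auc(labels, scores)
metrics = threshold_metrics(labels, scores, threshold=0.5)
print(f"AUC = {auc:.3f}")
print(metrics)
```

AUC is independent of any cutoff, whereas accuracy, sensitivity, and specificity depend on the chosen threshold, so the two kinds of metric answer different questions about a risk model.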

Keywords