BMC Gastroenterology (Apr 2024)

Interpretable machine learning-based clinical prediction model for predicting lymph node metastasis in patients with intrahepatic cholangiocarcinoma

  • Hui Xie,
  • Tao Hong,
  • Wencai Liu,
  • Xiaodong Jia,
  • Le Wang,
  • Huan Zhang,
  • Chan Xu,
  • Xiaoke Zhang,
  • Wen-Le Li,
  • Quan Wang,
  • Chengliang Yin,
  • Xu Lv

DOI
https://doi.org/10.1186/s12876-024-03223-w
Journal volume & issue
Vol. 24, no. 1
pp. 1 – 10

Abstract

Objective: Predicting lymph node metastasis (LNM) in intrahepatic cholangiocarcinoma (ICC) is critical for selecting a treatment regimen and assessing prognosis. We aimed to develop and validate machine learning (ML)-based predictive models for LNM in patients with ICC.

Methods: A total of 345 patients with clinicopathologically confirmed ICC, treated from Jan 2007 to Jan 2019, were enrolled. Predictors of LNM were identified by least absolute shrinkage and selection operator (LASSO) and logistic regression analysis. The selected variables were used to develop prediction models for LNM with six ML algorithms: logistic regression (LR), gradient boosting machine (GBM), extreme gradient boosting (XGB), random forest (RF), decision tree (DT), and multilayer perceptron (MLP). We applied 10-fold cross-validation as internal validation and calculated the average area under the receiver operating characteristic (ROC) curve (AUC) to measure the performance of each model. A feature selection approach was applied to assess the importance of predictors in each model. A heat map was used to examine correlations among features. Finally, we established a web calculator using the best-performing model.

Results: In multivariate logistic regression analysis, alcoholic liver disease (ALD), smoking, boundary, diameter, and white blood cell (WBC) count were identified as independent predictors of LNM in patients with ICC. In internal validation, the average AUC of the six models ranged from 0.820 to 0.908. The XGB model performed best, with an average AUC of 0.908. We built a web calculator on the XGB model to help clinicians estimate the likelihood of LNM.

Conclusion: The proposed ML-based predictive models performed well in predicting LNM in patients with ICC, and XGB performed best. A web calculator based on the ML algorithm showed promise in assisting clinicians to predict LNM and develop individualized treatment plans.
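
The internal validation step described in the Methods (10-fold cross-validation with the AUC averaged over folds) can be sketched as follows. This is a minimal illustration and not the authors' code: the data are synthetic, the stratified fold assignment is an assumption (the abstract does not say how folds were formed), and `fit_centroid_scorer` is a hypothetical toy stand-in for the six algorithms named above. Only the Python standard library is used.

```python
import random
from statistics import mean


def roc_auc(labels, scores):
    """AUC as the chance a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(labels, k, seed=0):
    """Assign indices to k folds, dealing each class out in turn (assumed scheme)."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


def fit_centroid_scorer(X_train, y_train):
    """Hypothetical toy model: score = projection onto (positive mean - negative mean)."""
    def col_mean(cls, j):
        return mean(x[j] for x, y in zip(X_train, y_train) if y == cls)

    w = [col_mean(1, j) - col_mean(0, j) for j in range(len(X_train[0]))]
    return lambda x: sum(wj * xj for wj, xj in zip(w, x))


def mean_cv_auc(fit, X, y, k=10, seed=0):
    """Train on k-1 folds, score the held-out fold, and average the k AUCs."""
    aucs = []
    for held_out in stratified_folds(y, k, seed):
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        fold_scores = [predict(X[i]) for i in held_out]
        aucs.append(roc_auc([y[i] for i in held_out], fold_scores))
    return mean(aucs)


# Synthetic data only: one informative feature and one pure-noise feature.
rng = random.Random(42)
y = [1] * 100 + [0] * 100
X = [[rng.gauss(1.5 if label else 0.0, 1.0), rng.gauss(0.0, 1.0)] for label in y]

avg_auc = mean_cv_auc(fit_centroid_scorer, X, y, k=10)
print(f"mean 10-fold AUC on synthetic data: {avg_auc:.3f}")
```

Any model that exposes a fit step returning a scoring function could be passed in place of `fit_centroid_scorer`, so each candidate algorithm would be compared on the same folds. The value printed here reflects only the synthetic data and has no bearing on the AUCs reported in the abstract.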

Keywords