BMC Gastroenterology (Oct 2024)

Analyzing risk factors and constructing a predictive model for superficial esophageal carcinoma with submucosal infiltration exceeding 200 micrometers

  • Yutong Cui,
  • Zichen Luo,
  • Xiaobo Wang,
  • Shiqi Liang,
  • Guangbing Hu,
  • Xinrui Chen,
  • Ji Zuo,
  • Lu Zhou,
  • Haiyang Guo,
  • Xianfei Wang

DOI
https://doi.org/10.1186/s12876-024-03442-1
Journal volume & issue
Vol. 24, no. 1
pp. 1 – 14

Abstract

Objective: Submucosal infiltration of less than 200 μm is considered an indication for endoscopic surgery in superficial esophageal cancer and precancerous lesions. This study aims to identify the risk factors associated with submucosal infiltration exceeding 200 μm in early esophageal cancer and precancerous lesions, and to establish and validate a corresponding predictive model.

Methods: Risk factors were identified through least absolute shrinkage and selection operator (LASSO) regression and multivariate logistic regression. Several machine learning (ML) classification models were developed and compared to select the most effective predictive model, and Shapley Additive Explanations (SHAP) were used for model visualization.

Results: Predictive factors for early esophageal cancer invading the submucosa included endoscopic ultrasonography or magnifying endoscopy assessment > SM1 (P < 0.001, OR = 3.972, 95% CI 2.161–7.478), esophageal wall thickening (P < 0.001, OR = 12.924, 95% CI 5.299–33.96), intake of pickled foods (P = 0.04, OR = 1.837, 95% CI 1.03–3.307), platelet-lymphocyte ratio (P < 0.001, OR = 0.284, 95% CI 0.137–0.556), tumor size (P < 0.027, OR = 2.369, 95% CI 1.128–5.267), the percentage of circumferential mucosal defect (P < 0.001, OR = 5.286, 95% CI 2.671–10.723), and preoperative pathological type (P < 0.001, OR = 4.079, 95% CI 2.254–7.476). The logistic regression model built from these factors was the best-performing model, with an area under the curve (AUC) of 0.922 in the training set, 0.899 in the validation set, and 0.850 in the test set.

Conclusion: A logistic regression model complemented by SHAP visualizations effectively identifies early esophageal cancer with submucosal infiltration exceeding 200 μm.
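The abstract reports model performance as an area under the curve (AUC). As a minimal sketch of what that number measures, the Python snippet below computes AUC as the proportion of (positive, negative) case pairs in which the positive case receives the higher predicted probability, counting ties as half. The function name `auc_score` and the toy labels and scores are assumptions made up for illustration; they are not the study's data or code.

```python
def auc_score(labels, scores):
    """Return the AUC: the fraction of (positive, negative) pairs in which
    the positive case has the higher score. Ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data invented for illustration only (NOT from the study):
# label 1 = infiltration exceeding 200 micrometers, 0 = otherwise.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.91, 0.74, 0.35, 0.62, 0.30, 0.22, 0.10]

toy_auc = auc_score(toy_labels, toy_scores)
print(f"toy AUC = {toy_auc:.3f}")
```

Read this way, an AUC of 0.850 in the test set means that in about 85% of such pairs the model ranks the deeper-infiltrating lesion above the other one. The pairwise loop is quadratic in the number of cases, which is fine for illustration; rank-based formulas give the same value more efficiently on large data sets.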

Keywords