BMC Cancer (Dec 2024)

Pathomics-based machine learning models for predicting pathological complete response and prognosis in locally advanced rectal cancer patients post-neoadjuvant chemoradiotherapy: insights from two independent institutional studies

  • Yiyi Zhang,
  • Ying Huang,
  • Meifang Xu,
  • Jiazheng Zhuang,
  • Zhibo Zhou,
  • Shaoqing Zheng,
  • Bingwang Zhu,
  • Guoxian Guan,
  • Hong Chen,
  • Xing Liu

DOI
https://doi.org/10.1186/s12885-024-13328-w
Journal volume & issue
Vol. 24, no. 1
pp. 1 – 13

Abstract

Background: Accurate prediction of pathological complete response (pCR) and disease-free survival (DFS) in locally advanced rectal cancer (LARC) patients undergoing neoadjuvant chemoradiotherapy (NCRT) is essential for formulating effective treatment plans. This study aimed to construct and validate machine learning (ML) models to predict pCR and DFS using pathomics.

Methods: A retrospective analysis was conducted on 294 patients who received NCRT at two independent institutions. Pathomics features were extracted from pre-NCRT H&E stains, and five ML models were developed and validated across the two centers using ROC, Kaplan-Meier, time-dependent ROC, and nomogram analyses.

Results: Among the five ML models, the XGBoost (XGB) model demonstrated superior performance in predicting pCR, achieving an AUC of 1.000 (p < 0.001) on the internal dataset and an AUC of 0.950 (p = 0.001) on the external dataset. The XGB model effectively differentiated between high-risk and low-risk prognosis patients in both centers: internal dataset (DFS, p = 0.002; overall survival (OS), p = 0.004) and external dataset (DFS, p = 0.074; OS, p = 0.224). Furthermore, Cox regression showed that tumor length (HR = 1.230, 95% CI: 1.050–1.440, p = 0.010), post-NCRT CEA (HR = 1.716, 95% CI: 1.031–2.858, p = 0.038), and XGB model score (HR = 0.128, 95% CI: 0.026–0.636, p = 0.012) were independent predictors of DFS after NCRT in the internal dataset. The nomogram built from the Cox regression, together with time-dependent AUC analysis, demonstrated strong predictive discrimination for DFS in LARC patients across the two independent institutions.

Conclusion: The ML model based on pathomics demonstrated effective prediction of pCR and prognosis in LARC patients. Further validation in larger cohorts is warranted to confirm the findings of this study.
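As a reading aid for the Cox regression figures, the short sketch below recovers the log-hazard coefficient, its standard error, and a two-sided p-value from each reported hazard ratio and 95% confidence interval. It assumes a Wald test with a normal approximation and a confidence interval that is symmetric on the log scale. The abstract does not state which test the authors used, and the function name `wald_from_hr` is illustrative only.

```python
import math

# Hazard ratios and 95% CIs as reported in the abstract (internal dataset, DFS).
REPORTED = {
    "tumor length": (1.230, 1.050, 1.440),
    "post-NCRT CEA": (1.716, 1.031, 2.858),
    "XGB model score": (0.128, 0.026, 0.636),
}

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def wald_from_hr(hr, ci_low, ci_high):
    """Return (beta, se, z, p) implied by a hazard ratio and its 95% CI.

    Assumes the CI was built as exp(beta +/- Z_95 * se), i.e. a Wald interval.
    """
    beta = math.log(hr)                                       # log-hazard coefficient
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)  # half-width on log scale
    z = beta / se                                             # Wald statistic
    p = math.erfc(abs(z) / math.sqrt(2))                      # two-sided normal p-value
    return beta, se, z, p


for name, (hr, lo, hi) in REPORTED.items():
    beta, se, z, p = wald_from_hr(hr, lo, hi)
    print(f"{name}: beta={beta:.3f}, se={se:.3f}, z={z:.2f}, p={p:.3f}")
```

Under these assumptions, the implied p-values land close to the reported ones. A hazard ratio below 1 for the XGB model score means that a higher score is associated with a lower hazard of a DFS event, whereas the ratios above 1 for tumor length and post-NCRT CEA indicate a higher hazard.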

Keywords