npj Precision Oncology (Jun 2023)

Bayesian risk prediction model for colorectal cancer mortality through integration of clinicopathologic and genomic data

  • Melissa Zhao,
  • Mai Chan Lau,
  • Koichiro Haruki,
  • Juha P. Väyrynen,
  • Carino Gurjao,
  • Sara A. Väyrynen,
  • Andressa Dias Costa,
  • Jennifer Borowsky,
  • Kenji Fujiyoshi,
  • Kota Arima,
  • Tsuyoshi Hamada,
  • Jochen K. Lennerz,
  • Charles S. Fuchs,
  • Reiko Nishihara,
  • Andrew T. Chan,
  • Kimmie Ng,
  • Xuehong Zhang,
  • Jeffrey A. Meyerhardt,
  • Mingyang Song,
  • Molin Wang,
  • Marios Giannakis,
  • Jonathan A. Nowak,
  • Kun-Hsing Yu,
  • Tomotaka Ugai,
  • Shuji Ogino

DOI
https://doi.org/10.1038/s41698-023-00406-8
Journal volume & issue
Vol. 7, no. 1
pp. 1 – 13

Abstract

Routine tumor-node-metastasis (TNM) staging of colorectal cancer is imperfect in predicting survival due to tumor pathobiological heterogeneity and imprecise assessment of tumor spread. We leveraged Bayesian additive regression trees (BART), a statistical learning technique, to comprehensively analyze patient-specific tumor characteristics for the improvement of prognostic prediction. Of 75 clinicopathologic, immune, microbial, and genomic variables in 815 stage II–III patients within two U.S.-wide prospective cohort studies, the BART risk model identified seven stable survival predictors. Risk stratifications (low risk, intermediate risk, and high risk) based on model-predicted survival were statistically significant (hazard ratios 0.19–0.45, vs. higher risk; P < 0.0001) and could be externally validated using The Cancer Genome Atlas (TCGA) data (P = 0.0004). BART demonstrated model flexibility, interpretability, and comparable or superior performance to other machine-learning models. Integrated bioinformatic analyses using BART with tumor-specific factors can robustly stratify colorectal cancer patients into prognostic groups and be readily applied to clinical oncology practice.