Scientific Reports (Sep 2024)

Developing machine learning models for personalized treatment strategies in early breast cancer patients undergoing neoadjuvant systemic therapy based on SEER database

  • Jiahui Ren,
  • Yili Li,
  • Jing Zhou,
  • Ting Yang,
  • Jingfeng Jing,
  • Qian Xiao,
  • Zhongxu Duan,
  • Ke Xiang,
  • Yuchen Zhuang,
  • Daxue Li,
  • Han Gao

DOI
https://doi.org/10.1038/s41598-024-72385-0
Journal volume & issue
Vol. 14, no. 1
pp. 1 – 12

Abstract

This study aimed to compare the long-term outcomes of breast-conserving surgery plus radiotherapy (BCS + RT) and mastectomy in early breast cancer (EBC) patients who received neoadjuvant systemic therapy (NST), and to construct and validate a machine learning model that could help clinicians formulate personalized treatment strategies for this patient population. We analyzed data from the Surveillance, Epidemiology, and End Results (SEER) database on EBC patients who underwent BCS + RT or mastectomy after NST (2010–2018). Using propensity score matching (PSM) to minimize potential biases, we compared breast cancer-specific survival (BCSS) and overall survival (OS) between the two surgical groups. We also trained and validated six machine learning survival models and developed a cloud-based recommendation system for surgical treatment based on the best-performing model. Among the 13,958 patients, 9028 (64.7%) underwent BCS + RT and 4930 (35.3%) underwent mastectomy. After PSM, there were 3715 patients in each group. Compared with mastectomy, BCS + RT significantly improved BCSS (p < 0.001) and OS (p < 0.001). Prognostic variables associated with BCSS were used to develop the machine learning models. In the training and validation cohorts, the random survival forest (RSF) model demonstrated the best predictive performance (0.847 and 0.795, respectively), outperforming the other machine learning models, namely Rpart (0.725 and 0.707), Xgboost (0.762 and 0.727), Glmboost (0.748 and 0.788), Survctree (0.764 and 0.766), and Survsvm (0.777 and 0.790), as well as the classical Cox model (0.749 and 0.782). Lastly, a web-based prediction tool was built to facilitate clinical application [ https://jhren.shinyapps.io/shinyapp1 ]. After adjustment for other confounders, BCS + RT was associated with improved outcomes in patients with EBC after NST compared with mastectomy. Moreover, the RSF model is a reliable tool that can predict long-term outcomes for patients, providing valuable guidance for the choice of operative method and for postoperative follow-up.
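
As an illustration of the matching step, the sketch below shows one common form of propensity score matching: greedy 1:1 nearest-neighbour matching without replacement. The abstract does not state which matching algorithm, caliper, or covariates were used, so the function name `greedy_match`, the caliper of 0.05, and the toy patient identifiers and scores are all assumptions made for this example; the propensity scores are taken as already estimated (for instance by logistic regression).

```python
from typing import List, Tuple

Patient = Tuple[str, float]  # (hypothetical patient id, estimated propensity score)


def greedy_match(treated: List[Patient],
                 control: List[Patient],
                 caliper: float = 0.05) -> List[Tuple[str, str]]:
    """Greedy 1:1 nearest-neighbour matching on the propensity score.

    Each treated patient is paired with the closest still-unmatched control,
    provided the score difference is within the caliper. Controls are used
    at most once (matching without replacement).
    """
    available = dict(control)  # control id -> score, for unmatched controls
    pairs: List[Tuple[str, str]] = []
    for t_id, t_score in sorted(treated, key=lambda p: p[1]):
        if not available:
            break
        # Closest remaining control by absolute score difference.
        c_id = min(available, key=lambda cid: abs(available[cid] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]
    return pairs


# Toy data (invented for illustration, not taken from the study).
mastectomy = [("m1", 0.30), ("m2", 0.55), ("m3", 0.90)]
bcs_rt = [("b1", 0.32), ("b2", 0.50), ("b3", 0.58), ("b4", 0.10)]

pairs = greedy_match(mastectomy, bcs_rt)
print(pairs)
```

In this toy run, `m3` stays unmatched because no remaining control lies within the caliper. Discarding patients without a close counterpart is how matching can reduce two groups of unequal size to cohorts of equal size.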

Keywords