Scientific Reports (Oct 2024)

Prediction model for survival of younger patients with breast cancer using the breast cancer public staging database

  • Ha Ye Jin Kang,
  • Minsam Ko,
  • Kwang Sun Ryu

DOI
https://doi.org/10.1038/s41598-024-76331-y
Journal volume & issue
Vol. 14, no. 1
pp. 1 – 12

Abstract

Breast cancer (BC) is a major contributor to female mortality worldwide, particularly among young women with aggressive tumors. Despite the need for accurate prognosis in this demographic, existing studies primarily focus on broader age groups and often rely on the SEER database, which has limitations in variable selection. This study aimed to develop a machine learning (ML)-based model to predict survival outcomes in young BC patients using the BC public staging database. A total of 3,401 patients with BC were included and categorized as younger (n = 1,574) or older (n = 1,827). We applied several survival models, namely Random Survival Forest, Gradient Boosting Survival, Extra Survival Trees (EST), and penalized Cox models (Lasso and ElasticNet), to compare mortality characteristics between the two groups. The EST model outperformed the others in predicting mortality for both age groups. Older patients exhibited a higher prevalence of comorbidities than younger patients. Tumor stage was the primary variable used to train the model for mortality prediction in both groups. Chronic obstructive pulmonary disease (COPD) was a significant variable only in younger patients with BC. Other variables exhibited varying degrees of consistency in each group. By predicting the risk of mortality, these findings can help identify high-risk young female patients with BC who require aggressive treatment.
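The abstract does not state which metric was used to rank the survival models. A common choice for this kind of comparison is Harrell's concordance index (C-index), so the following is a minimal sketch under that assumption, not the authors' evaluation code. The function name, the toy follow-up data, and the two "model" score lists are all hypothetical and were invented for illustration.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    times       -- follow-up time for each patient
    events      -- 1 if death was observed, 0 if the patient was censored
    risk_scores -- model output; a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair must be anchored on an observed death
        for j in range(n):
            # Patient i died before patient j was last seen, so the
            # pair is comparable and i should have the higher risk score.
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable


# Toy data (invented): four patients, the third one is censored.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]

# Risk scores from two hypothetical models on the same patients.
model_a = [0.9, 0.7, 0.4, 0.1]
model_b = [0.9, 0.1, 0.4, 0.7]

c_a = concordance_index(times, events, model_a)
c_b = concordance_index(times, events, model_b)
print(f"model A: {c_a:.2f}, model B: {c_b:.2f}")
```

A value of 1.0 means every comparable pair is ranked correctly, and 0.5 is what constant or random scores would give. Under this metric, the model with the higher value on held-out patients would be the one reported as outperforming the others.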

Keywords