Cancer Management and Research (Mar 2022)

Development and Validation of a New Multiparametric Random Survival Forest Predictive Model for Breast Cancer Recurrence with a Potential Benefit to Individual Outcomes

  • Li H,
  • Liu RB,
  • Long CM,
  • Teng Y,
  • Cheng L,
  • Liu Y

Journal volume & issue
Vol. 14
pp. 909 – 923

Abstract

Huan Li,1,* Ren-Bin Liu,1,* Chen-Meng Long,2 Yuan Teng,3 Lin Cheng,1 Yu Liu1

1Department of Thyroid and Breast Surgery, Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, Guangdong, People’s Republic of China; 2Department of Breast Surgery, Liuzhou Women and Children’s Medical Center, Liuzhou, Guangxi, People’s Republic of China; 3Department of Breast Surgery, Guangzhou Women and Children’s Medical Center, Guangzhou, Guangdong, People’s Republic of China

*These authors contributed equally to this work

Correspondence: Yu Liu, Tel +8613560170809, Fax +86 20 85252154, Email [email protected]

Breast cancer (BC) is a multi-factorial disease. Individual prognosis varies; thus, individualized patient profiling is instrumental to improving BC management and individual outcomes. An economical, multiparametric, and practical model to predict BC recurrence is needed.

Patients and Methods: We retrospectively investigated the clinical data of BC patients treated at the Third Affiliated Hospital of Sun Yat-sen University and Liuzhou Women and Children’s Medical Center from January 2013 to December 2020. Random forest-recursive feature elimination (run with the R caret package) was used to determine the best variable set, and the random survival forest method was used to develop a predictive model for BC recurrence.

Results: The training and validation sets included 623 and 151 patients, respectively. Using random survival forest (RSF)-recursive feature elimination, we selected 14 variables: pathological (TNM) stage, gamma-glutamyl transpeptidase, total cholesterol, Ki-67, lymphocyte count, low-density lipoprotein, age, apolipoprotein B, high-density lipoprotein, globulin, neutrophil count to lymphocyte count ratio, alanine aminotransferase, triglyceride, and albumin to globulin ratio. We developed a recurrence prediction model using RSF. Area under the receiver operating characteristic curve and Kaplan–Meier survival analyses indicated that the model performed accurately. C-indexes were 0.997 and 0.936 for the training and validation sets, respectively.

Conclusion: The model could accurately predict BC recurrence. It aids clinicians in identifying high-risk patients and making treatment decisions for breast cancer patients in China. This new multiparametric RSF model is instrumental for breast cancer recurrence prediction and potentially improves individual outcomes.

Keywords: breast cancer, random survival forest, recurrence, individualized patient profiles, multi-level diagnostics and disease modeling
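The C-index reported above measures how often a model ranks patients correctly: among comparable patient pairs, it is the fraction in which the patient who recurred earlier was assigned the higher predicted risk. The abstract does not state how the authors computed it, so the following is only a minimal Python sketch of Harrell's concordance index under common conventions. The function name `concordance_index`, the decision to skip tied follow-up times, and the toy data are assumptions for illustration, not the study's implementation or data.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data (illustrative sketch).

    times       -- follow-up time for each patient
    events      -- 1 if recurrence was observed, 0 if censored
    risk_scores -- model output; a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # assumption: pairs with tied times are skipped
        # Let a be the patient with the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # earlier recurrence got the higher risk score
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied scores count as half concordant
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four patients, the second one censored.
toy_times = [5, 10, 15, 20]
toy_events = [1, 0, 1, 1]
toy_risk = [0.9, 0.4, 0.1, 0.6]
print(concordance_index(toy_times, toy_events, toy_risk))
```

A value of 1.0 means every comparable pair is ranked correctly and 0.5 corresponds to random ranking, which gives some context for the reported 0.997 (training) and 0.936 (validation).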

Keywords