Random survival forest algorithm for risk stratification and survival prediction in gastric neuroendocrine neoplasms

Tianbao Liao; Tingting Su; Yang Lu; Lina Huang; Wei‑Yuan Wei; Lu-Huai Feng

doi:10.1038/s41598-024-77988-1

Scientific Reports (Nov 2024)

Affiliations

Tianbao Liao: Department of President’s Office, Youjiang Medical University for Nationalities
Tingting Su: Department of ECG Diagnostics, The People’s Hospital of Guangxi Zhuang Autonomous Region
Yang Lu: Department of International Medical, The Affiliated Tumor Hospital of Guangxi Medical University
Lina Huang: Department of Endocrinology and Metabolism Nephrology, The Affiliated Tumor Hospital of Guangxi Medical University
Wei‑Yuan Wei: Department of Gastric and Abdominal Tumor Surgery, The Affiliated Tumor Hospital of Guangxi Medical University
Lu-Huai Feng: Department of Endocrinology and Metabolism Nephrology, The Affiliated Tumor Hospital of Guangxi Medical University

Journal volume & issue: Vol. 14, no. 1
pp. 1 – 11

Abstract

This study aimed to construct and assess a machine-learning algorithm for predicting survival and stratifying risk in patients with gastric neuroendocrine neoplasms (gNENs) after diagnosis. Data on patients with gNENs were extracted from the Surveillance, Epidemiology, and End Results database and randomly divided into training and validation sets. We developed a prediction model using 10 machine learning algorithms across 101 combinations to forecast cancer-related mortality in patients with gNENs, selecting the best model by the highest mean time-dependent area under the receiver operating characteristic (ROC) curve (AUC) over a sequence of time points. The performance of the final model was assessed through time-dependent ROC curves for discrimination and calibration curves for calibration. The maximum selection rank method was used to determine the best prognostic risk score threshold for classifying patients into high- and low-risk groups. Kaplan–Meier analysis and the log-rank test were then used to compare survival between these groups. The study included 775 patients with gNENs. The training set comprised 543 patients, with a median follow-up of 42 months and cumulative mortality rates of 40.0% at 1 year, 48.6% at 3 years, and 54.0% at 5 years. The validation set comprised 232 patients, with cumulative mortality rates of 29.1% at 1 year, 43.5% at 3 years, and 53.2% at 5 years. The optimal random survival forest (RSF) model (mtry = 4, node size = 5) achieved an AUC of 0.839 for survival prediction in the training set. Built on 11 variables covering demographics, treatment details, tumor characteristics, T staging, N staging, and M staging, the RSF model showed high predictive accuracy, with AUCs of 0.92, 0.96, and 0.96 for 1-, 3-, and 5-year survival, respectively, in the training set and 0.88, 0.92, and 0.89, respectively, in the validation set. The RSF model also stratified patients into distinct prognostic groups. However, it needs external validation to confirm its utility for noninvasive prognostic prediction and risk stratification in gNENs, and further research is required to verify its broader clinical applicability.
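
The risk-stratification step described in the abstract (choose a risk-score threshold, then compare the resulting groups with Kaplan–Meier curves and a log-rank test) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the `Patient` record, the function names, and the eight toy patients are all hypothetical, and the cutoff scan is a simplified stand-in for the maximum selection rank method, since it only keeps the split with the largest log-rank statistic and applies no multiplicity adjustment.

```python
from collections import namedtuple

# Hypothetical record: model risk score, follow-up in months, event flag
# (1 = cancer-related death, 0 = censored). Not taken from the study data.
Patient = namedtuple("Patient", "risk time event")


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time where a death occurred."""
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= leaving  # deaths and censored patients leave the risk set
    return curve


def logrank_chi2(group_a, group_b):
    """Two-group log-rank chi-square statistic (1 degree of freedom)."""
    pooled = group_a + group_b
    obs_minus_exp = 0.0
    variance = 0.0
    for t in sorted({p.time for p in pooled if p.event == 1}):
        n = sum(1 for p in pooled if p.time >= t)
        n_a = sum(1 for p in group_a if p.time >= t)
        d = sum(1 for p in pooled if p.time == t and p.event == 1)
        d_a = sum(1 for p in group_a if p.time == t and p.event == 1)
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    return obs_minus_exp ** 2 / variance if variance > 0 else 0.0


def split_at(patients, cut):
    """Split into (low-risk, high-risk) groups at a risk-score cutoff."""
    low = [p for p in patients if p.risk <= cut]
    high = [p for p in patients if p.risk > cut]
    return low, high


def best_cutoff(patients, min_group=2):
    """Scan observed risk scores; keep the cutoff with the largest statistic."""
    best_cut, best_stat = None, -1.0
    for cut in sorted({p.risk for p in patients}):
        low, high = split_at(patients, cut)
        if len(low) < min_group or len(high) < min_group:
            continue  # skip splits that leave a group too small
        stat = logrank_chi2(high, low)
        if stat > best_stat:
            best_cut, best_stat = cut, stat
    return best_cut, best_stat


# Toy cohort, invented purely for illustration.
patients = [
    Patient(0.10, 60, 0), Patient(0.20, 55, 0),
    Patient(0.30, 48, 1), Patient(0.35, 60, 0),
    Patient(0.60, 20, 1), Patient(0.70, 12, 1),
    Patient(0.80, 6, 1), Patient(0.90, 3, 1),
]

cut, stat = best_cutoff(patients)
low, high = split_at(patients, cut)
km_low = kaplan_meier([p.time for p in low], [p.event for p in low])
km_high = kaplan_meier([p.time for p in high], [p.event for p in high])
print(f"cutoff={cut}, log-rank chi-square={stat:.2f}")
print("low-risk KM:", km_low)
print("high-risk KM:", km_high)
```

With a cohort this small the selected cutoff is unstable, which is one reason a threshold chosen this way should be checked on held-out or external data before it is used for stratification.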

Published in Scientific Reports

ISSN: 2045-2322 (Online)
Publisher: Nature Portfolio
Country of publisher: United Kingdom
LCC subjects: Medicine; Science
Website: https://www.nature.com/srep/
