Scientific Reports (Apr 2024)
Machine learning-based survival prediction nomogram for postoperative parotid mucoepidermoid carcinoma
Abstract
Parotid mucoepidermoid carcinoma (P-MEC) is a significant histopathological subtype of salivary gland cancer with inherent heterogeneity and complexity. Existing clinical models do not adequately support personalized treatment decisions for patients. In response, we assessed the efficacy of four machine learning algorithms against traditional analysis in forecasting the overall survival (OS) of P-MEC patients. Using the SEER database, we analyzed data from 882 postoperative P-MEC patients (stages I–IVA). Univariate Cox regression and four machine learning techniques (random forest, LASSO, XGBoost, and best subset regression [BSR]) were employed for variable selection. The optimal model was selected using backward stepwise regression, the Akaike Information Criterion (AIC), and the area under the curve (AUC). Bootstrap resampling was used for internal validation, and prediction accuracy was assessed with the C-index, time-dependent ROC curves, and calibration curves. The model's clinical relevance was evaluated using decision curve analysis (DCA). The 3-, 5-, and 10-year OS rates were 0.887, 0.841, and 0.753, respectively. XGBoost, BSR, and LASSO stood out in predictive efficacy, identifying seven key prognostic factors: age, pathological grade, T stage, N stage, radiation therapy, chemotherapy, and marital status. The resulting nomogram achieved C-indices of 0.8499 (3-year), 0.8557 (5-year), and 0.8375 (10-year), with corresponding AUC values of 0.8670, 0.8879, and 0.8767. The model also highlighted the clinical significance of postoperative radiotherapy across varying risk levels. Our prognostic model, grounded in machine learning, surpasses traditional models in prediction and offers superior visualization of variable importance.
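To make the C-index and bootstrap steps concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It computes Harrell's concordance index for right-censored data and averages it over bootstrap resamples of fixed risk scores. The function names `concordance_index` and `bootstrap_c_index`, the resample count, and the seed are illustrative assumptions; a full internal validation would typically also refit the model within each resample to correct for optimism, which this sketch omits.

```python
import random


def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    A pair (i, j) is comparable when subject i has the shorter follow-up
    time and an observed event (events[i] == 1). The pair is concordant
    when subject i also has the higher predicted risk; tied risks count 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # censored subjects cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def bootstrap_c_index(times, events, risk_scores, n_boot=200, seed=0):
    """Mean C-index over bootstrap resamples (hypothetical helper).

    Assumption: risk scores are held fixed; the model is NOT refitted per
    resample, so this is not an optimism-corrected estimate.
    """
    rng = random.Random(seed)
    n = len(times)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        try:
            estimates.append(
                concordance_index(
                    [times[k] for k in idx],
                    [events[k] for k in idx],
                    [risk_scores[k] for k in idx],
                )
            )
        except ValueError:
            continue  # skip degenerate resamples with no comparable pairs
    if not estimates:
        raise ValueError("no usable bootstrap resamples")
    return sum(estimates) / len(estimates)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the nomogram's reported C-indices should be read.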
Keywords