Frontiers in Surgery (Mar 2023)

A machine learning-based model for predicting the risk of early-stage inguinal lymph node metastases in patients with squamous cell carcinoma of the penis

  • Li Ding,
  • Chi Zhang,
  • Kun Wang,
  • Yang Zhang,
  • Chuang Wu,
  • Wentao Xia,
  • Shuaishuai Li,
  • Wang Li,
  • Junqi Wang

DOI
https://doi.org/10.3389/fsurg.2023.1095545
Journal volume & issue
Vol. 10

Abstract

Objective: Inguinal lymph node metastasis (ILNM) is significantly associated with poor prognosis in patients with squamous cell carcinoma of the penis (SCCP). Patient prognosis could be improved if the probability of ILNM could be accurately predicted at an early stage. We developed a predictive model based on machine learning (ML) combined with big data to achieve this.

Methods: Data of patients diagnosed with SCCP were obtained from the Surveillance, Epidemiology, and End Results (SEER) Program Research Data. By combining variables that represented the patients' clinical characteristics, we applied five machine learning algorithms to create predictive models: logistic regression, eXtreme Gradient Boosting (XGB), Random Forest, Support Vector Machine, and k-Nearest Neighbor. Model performance was evaluated with ten-fold cross-validation; receiver operating characteristic curves were used to calculate the area under the curve of each of the five models as a measure of predictive accuracy. Decision curve analysis was conducted to estimate the clinical utility of the models. An external validation cohort of 74 SCCP patients was selected from the Affiliated Hospital of Xuzhou Medical University (February 2008 to March 2021).

Results: A total of 1,056 patients with SCCP from the SEER database were enrolled as the training cohort, of whom 164 (15.5%) developed early-stage ILNM. In the external validation cohort, 16.2% of patients developed early-stage ILNM. Multivariate logistic regression showed that tumor grade, inguinal lymph node dissection, radiotherapy, and chemotherapy were independent predictors of early-stage ILNM risk. The model based on the XGB algorithm showed stable and efficient prediction performance in both the training and external validation groups.

Conclusion: The ML model based on the XGB algorithm has high predictive effectiveness and may be used to predict early-stage ILNM risk in SCCP patients. Therefore, it may show promise in clinical decision-making.
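As an illustration of the two evaluation measures named in the Methods (area under the receiver operating characteristic curve and decision curve analysis), the following is a minimal sketch in plain Python. It is not the authors' code: the function names, the toy labels, and the toy risk scores are all assumptions made for this example, and the net-benefit formula shown is the standard textbook definition rather than anything stated in the abstract.

```python
# Illustrative sketch only: toy data and hypothetical helper names,
# not the study's data, models, or implementation.


def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def net_benefit(labels, scores, threshold):
    """Decision-curve net benefit of treating every patient whose
    predicted risk is at or above `threshold`:
    TP/n - (FP/n) * threshold / (1 - threshold)."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


def treat_all_net_benefit(labels, threshold):
    """Net benefit of the reference strategy that treats every patient."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy cohort (assumed): 1 = early-stage ILNM, 0 = no ILNM.
labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
# Toy predicted risks (assumed), e.g. from any of the five classifiers.
scores = [0.9, 0.7, 0.6, 0.4, 0.3, 0.2, 0.2, 0.1, 0.1, 0.05]

auc = roc_auc(labels, scores)
nb_model = net_benefit(labels, scores, threshold=0.5)
nb_all = treat_all_net_benefit(labels, threshold=0.5)
print(f"AUC = {auc:.3f}")
print(f"net benefit (model) = {nb_model:.3f}")
print(f"net benefit (treat all) = {nb_all:.3f}")
```

In a decision curve, these net-benefit values would be plotted over a range of threshold probabilities; a model is considered clinically useful at thresholds where its curve lies above both the treat-all and treat-none (net benefit of zero) strategies.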

Keywords