BMC Cancer (Apr 2024)

Integrated clinical and genomic models using machine-learning methods to predict the efficacy of paclitaxel-based chemotherapy in patients with advanced gastric cancer

  • Yonghwa Choi,
  • Jangwoo Lee,
  • Keewon Shin,
  • Ji Won Lee,
  • Ju Won Kim,
  • Soohyeon Lee,
  • Yoon Ji Choi,
  • Kyong Hwa Park,
  • Jwa Hoon Kim

DOI
https://doi.org/10.1186/s12885-024-12268-9
Journal volume & issue
Vol. 24, no. 1
pp. 1–9

Abstract

Background: Paclitaxel is commonly used as a second-line therapy for advanced gastric cancer (AGC). The decision to proceed with second-line chemotherapy and the choice of regimen are critical for vulnerable patients with AGC progressing after first-line chemotherapy. However, no predictive biomarkers exist to identify patients with AGC who would benefit from paclitaxel-based chemotherapy.

Methods: This study included 288 patients with AGC who received second-line paclitaxel-based chemotherapy between 2017 and 2022 as part of the K-MASTER project, a nationwide government-funded precision medicine initiative. The data comprised clinical factors (age [young-onset vs. others], sex, histology [intestinal vs. diffuse type], prior trastuzumab use, and duration of first-line chemotherapy) and genomic factors (pathogenic or likely pathogenic variants). Data were randomly divided into training and validation sets (0.8:0.2). Four machine learning (ML) methods, namely random forest (RF), logistic regression (LR), artificial neural network (ANN), and ANN with genetic embedding (ANN with GE), were used to develop prediction models, which were then evaluated in the validation set.

Results: The median patient age was 64 years (range 25–91), and 65.6% of patients were male. The 288 patients were divided into training (n = 230) and validation (n = 58) sets. Baseline characteristics did not differ significantly between the training and validation sets. In the training set, the areas under the ROC curves (AUROC) for predicting better progression-free survival (PFS) with paclitaxel-based chemotherapy were 0.499, 0.679, 0.618, and 0.732 for the RF, LR, ANN, and ANN with GE models, respectively. The ANN with GE model, which achieved the highest AUROC, recorded accuracy, sensitivity, specificity, and F1-score of 0.458, 0.912, 0.724, and 0.579, respectively. In the validation set, patients predicted to be paclitaxel-sensitive by the ANN with GE model had significantly longer PFS (median PFS 7.59 vs. 2.07 months, P = 0.020) and overall survival (OS) (median OS 14.70 vs. 7.50 months, P = 0.008). Patients predicted to be paclitaxel-sensitive by the LR model showed a trend toward longer PFS (median PFS 6.48 vs. 2.33 months, P = 0.078) and OS (median OS 12.20 vs. 8.61 months, P = 0.099).

Conclusions: These ML models, which integrate clinical and genomic factors, may help identify patients with AGC who could benefit from paclitaxel chemotherapy.
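As a reading aid, the 0.8:0.2 random split and the four threshold-based metrics named in the abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names `split_indices` and `binary_metrics`, the fixed seed, and the rounding-down of the training-set size are all assumptions, and the toy labels are made up.

```python
import random


def split_indices(n, train_fraction=0.8, seed=0):
    """Randomly split range(n) into training and validation index lists.

    Hypothetical helper: the seed and the floor rounding are assumptions.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_train = int(n * train_fraction)  # 288 * 0.8 = 230.4, floored to 230
    return indices[:n_train], indices[n_train:]


def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity and F1 from 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall on positives
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # recall on negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Split sizes for a cohort of 288 patients.
train_idx, val_idx = split_indices(288)
print(len(train_idx), len(val_idx))

# Toy labels (illustrative only, not study data).
print(binary_metrics([1, 1, 0, 0], [1, 0, 0, 1]))
```

With 288 patients, flooring 288 × 0.8 gives the 230/58 division reported in the Results.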

Keywords