Frontiers in Oncology (Jan 2025)
Machine learning for predicting neoadjuvant chemotherapy effectiveness using ultrasound radiomics features and routine clinical data of patients with breast cancer
Abstract
Background: This study explores the clinical value of a machine learning (ML) model that combines ultrasound radiomics features of primary foci with clinicopathologic factors to predict pathological complete response (pCR) to neoadjuvant chemotherapy (NAC) in patients with breast cancer (BC). Method: We retrospectively analyzed ultrasound images and clinical information from 231 participants with BC who received NAC. These patients were randomly assigned to training and validation cohorts. Tumor regions of interest (ROIs) were delineated, and radiomics features were extracted. Z-score normalization, Pearson correlation analysis, and the least absolute shrinkage and selection operator (LASSO) were used to screen ultrasound radiomics and clinical features. Univariate and multivariate logistic regression analyses were performed to identify the clinical features that were independently associated with pCR. We compared 10 ML models based on radiomics features: support vector machine (SVM), logistic regression (LR), random forest, extra trees (ET), naïve Bayes (NB), k-nearest neighbor (KNN), multilayer perceptron (MLP), gradient boosting machine (GBM), light GBM (LGBM), and adaptive boosting (AB). Diagnostic performance was evaluated using the area under the receiver operating characteristic (ROC) curve (AUC), accuracy, sensitivity, and specificity, and the Rad score was calculated. Subsequently, clinical predictive models and combined Rad score and clinical predictive models were constructed using ML algorithms to obtain optimal diagnostic performance. The decision process of the ML model was visualized and analyzed using SHapley Additive exPlanation (SHAP). Results: Of the 231 participants with BC, 98 (42.42%) achieved pCR and 133 (57.58%) did not. Twelve radiomics features were identified, with the GBM model demonstrating the best predictive performance (AUC of 0.851, accuracy of 0.75, sensitivity of 0.821, and specificity of 0.698).
The clinical feature prediction model using the GBM algorithm had an AUC of 0.819 and an accuracy of 0.739. Combining the Rad score with clinical features in the GBM model yielded better predictive performance than either model alone (AUC of 0.939 and accuracy of 0.87). SHAP analysis indicated that participants with a high Rad score, progesterone receptor (PR)-negative status, estrogen receptor (ER)-negative status, and human epidermal growth factor receptor-2 (HER-2)-positive status were more likely to achieve pCR. Decision curve analysis showed that the combined GBM model provided a higher clinical benefit. Conclusion: The GBM model based on ultrasound radiomics features and routine clinical data of BC patients performed well in predicting pCR. SHAP analysis provided a clear explanation of the GBM model's predictions, revealing that patients with a high Rad score, PR-negative status, ER-negative status, and HER-2-positive status are more likely to achieve pCR.
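As an illustration of the evaluation metrics named in the Method (AUC, accuracy, sensitivity, and specificity), the following is a minimal sketch in plain Python. It is not the authors' code: the function names `roc_auc` and `confusion_metrics`, the 0.5 decision threshold, and the eight toy labels and scores are all assumptions made for this example and are unrelated to the study data.

```python
from typing import Dict, Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def confusion_metrics(
    y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity at a fixed threshold.

    The 0.5 default is an assumption for illustration only.
    """
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, scores):
        pred = 1 if s >= threshold else 0
        if y == 1 and pred == 1:
            tp += 1
        elif y == 1 and pred == 0:
            fn += 1
        elif y == 0 and pred == 1:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # true-positive rate (pCR correctly flagged)
        "specificity": tn / (tn + fp),  # true-negative rate (non-pCR correctly flagged)
    }


# Toy data, invented for this sketch: 1 = pCR, 0 = non-pCR.
toy_labels = [1, 1, 1, 0, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]

print(roc_auc(toy_labels, toy_scores))            # 0.9375
print(confusion_metrics(toy_labels, toy_scores))  # all three metrics are 0.75
```

On the toy data, 15 of the 16 positive-negative pairs are ranked correctly, which gives the AUC of 0.9375. AUC does not depend on a threshold, whereas accuracy, sensitivity, and specificity all change with the chosen cut-off.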
Keywords