PeerJ (Jul 2024)

Comparison of radiomics-based machine-learning classifiers for the pretreatment prediction of pathologic complete response to neoadjuvant therapy in breast cancer

  • Xue Li,
  • Chunmei Li,
  • Hong Wang,
  • Lei Jiang,
  • Min Chen

DOI
https://doi.org/10.7717/peerj.17683
Journal volume & issue
Vol. 12
p. e17683

Abstract


Background: Machine learning (ML) classifiers are increasingly used to build predictive models of pathological complete response (pCR) in breast cancer after neoadjuvant therapy (NAT), but few studies have compared the effectiveness of different ML classifiers. This study evaluated radiomics models based on pre- and post-contrast first-phase T1-weighted images (T1WI) for predicting breast cancer pCR after NAT and compared the performance of ML classifiers.

Methods: This retrospective study enrolled 281 patients undergoing NAT from the Duke-Breast-Cancer-MRI dataset. Radiomic features were extracted from pre- and post-contrast first-phase T1WI. The Synthetic Minority Oversampling Technique (SMOTE) was applied, and the dataset was then randomly divided into training and validation groups (7:3). The radiomics model was built using selected optimal features. Six classifiers were compared: support vector machine (SVM), random forest (RF), decision tree (DT), k-nearest neighbor (KNN), extreme gradient boosting (XGBoost), and light gradient boosting machine (LightGBM). Receiver operating characteristic curves were used to assess predictive performance.

Results: LightGBM performed best in predicting pCR (area under the curve (AUC): 0.823, 95% confidence interval (CI) [0.743–0.902], accuracy 74.0%, sensitivity 85.0%, specificity 67.2%). In subgroup analysis, RF was most effective in predicting pCR in luminal breast cancers (AUC: 0.914, 95% CI [0.847–0.981], accuracy 87.0%, sensitivity 85.2%, specificity 88.1%). In triple-negative breast cancers, LightGBM performed best (AUC: 0.836, 95% CI [0.708–0.965], accuracy 78.6%, sensitivity 68.2%, specificity 90.0%).

Conclusion: The LightGBM-based radiomics model performed best in predicting pCR in patients with breast cancer. RF and LightGBM showed promising results for luminal and triple-negative breast cancers, respectively.
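The abstract reports AUC, accuracy, sensitivity, and specificity. As a minimal illustration of how these quantities are defined, the Python sketch below computes them from scratch. It is not the authors' pipeline: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for this example and have no connection to the study data.

```python
from typing import Dict, Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def threshold_metrics(
    labels: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity at a fixed score threshold.
    The 0.5 default is an illustrative assumption, not a value from the paper."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        predicted_positive = s >= threshold
        if y == 1 and predicted_positive:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate (pCR correctly flagged)
        "specificity": tn / (tn + fp),  # true negative rate (non-pCR correctly flagged)
    }


# Hypothetical toy data: 1 = pCR, 0 = non-pCR; scores are made-up model outputs.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 1]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

toy_auc = roc_auc(toy_labels, toy_scores)
toy_metrics = threshold_metrics(toy_labels, toy_scores)
print(f"AUC = {toy_auc:.4f}", toy_metrics)
```

The pairwise definition of AUC is quadratic in the number of cases, which is acceptable for a cohort of a few hundred patients. In practice a library routine would normally be used instead.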

Keywords