Sensors (Jan 2023)

A Two-Step Feature Selection Radiomic Approach to Predict Molecular Outcomes in Breast Cancer

  • Valentina Brancato,
  • Nadia Brancati,
  • Giusy Esposito,
  • Massimo La Rosa,
  • Carlo Cavaliere,
  • Ciro Allarà,
  • Valeria Romeo,
  • Giuseppe De Pietro,
  • Marco Salvatore,
  • Marco Aiello,
  • Mara Sangiovanni

DOI
https://doi.org/10.3390/s23031552
Journal volume & issue
Vol. 23, no. 3
p. 1552

Abstract

Breast Cancer (BC) is the most common cancer among women worldwide and is characterized by intra- and inter-tumor heterogeneity that strongly contributes to its poor prognosis. The Estrogen Receptor (ER), Progesterone Receptor (PR), Human Epidermal Growth Factor Receptor 2 (HER2), and Ki67 antigen are the most examined markers depicting BC heterogeneity and have been shown to have a strong impact on BC prognosis. Radiomics can noninvasively predict BC heterogeneity through the quantitative evaluation of medical images, such as Magnetic Resonance Imaging (MRI), which has become increasingly important in the detection and characterization of BC. However, the lack of comprehensive BC datasets in terms of molecular outcomes and MRI modalities, and the absence of a general methodology to build and compare feature selection approaches and predictive models, limit the routine use of radiomics in BC clinical practice. In this work, a new radiomic approach based on a two-step feature selection process was proposed to build predictors for the ER, PR, HER2, and Ki67 markers. An in-house dataset was used, containing 92 multiparametric MRIs of patients with histologically proven BC and all four relevant biomarkers available. Thousands of radiomic features were extracted from post-contrast and subtracted Dynamic Contrast-Enhanced (DCE) MRI images, Apparent Diffusion Coefficient (ADC) maps, and T2-weighted (T2) images. The two-step feature selection approach was used first to identify significant radiomic features and then to build the final prediction models. The resulting models achieved F1-scores of 84%, 63%, 90%, and 72% for ER, HER2, Ki67, and PR, respectively. When possible, the models were validated on the TCGA/TCIA Breast Cancer dataset, returning promising results (F1-score = 88% for the ER+/ER− classification task). The developed approach efficiently characterized BC heterogeneity according to the examined molecular biomarkers.

Keywords