BMC Cancer (Dec 2024)

Prediction of gene expression-based breast cancer proliferation scores from histopathology whole slide images using deep learning

  • Andreas Ekholm,
  • Yinxi Wang,
  • Johan Vallon-Christersson,
  • Constance Boissin,
  • Mattias Rantalainen

DOI
https://doi.org/10.1186/s12885-024-13248-9
Journal volume & issue
Vol. 24, no. 1
pp. 1 – 13

Abstract

Background: In breast cancer, several gene expression assays have been developed to enable more personalised treatment. This study focuses on the prediction of two molecular proliferation signatures: an 11-gene proliferation score and the MKI67 proliferation marker gene. The aim was to assess whether these could be predicted from digital whole slide images (WSIs) using deep learning models.

Methods: WSIs and RNA-sequencing data from 819 invasive breast cancer patients were included for training, and models were evaluated on an internal test set of 172 cases as well as on 997 cases from a fully independent external test set. Two deep Convolutional Neural Network (CNN) models were optimised using WSIs and gene expression readouts from RNA-sequencing data, one for the proliferation score and one for the proliferation marker, and assessed using Spearman correlation (ρ). Prognostic performance was assessed through Cox proportional hazard modelling, estimating hazard ratios (HR).

Results: The optimised CNNs predicted the proliferation score and the proliferation marker on the unseen internal test set (ρ = 0.691 (p < 0.001) with R² = 0.438, and ρ = 0.564 (p < 0.001) with R² = 0.251, respectively) and on the external test set (ρ = 0.502 (p < 0.001) with R² = 0.319, and ρ = 0.403 (p < 0.001) with R² = 0.222, respectively). A high proliferation score or marker was significantly associated with a higher risk of recurrence or death in the external test set (HR = 1.65 (95% CI: 1.05–2.61) and HR = 1.84 (95% CI: 1.17–2.89), respectively).

Conclusions: The results suggest that gene expression levels of proliferation scores can be predicted directly from breast cancer morphology in WSIs using CNNs, and that the predictions provide prognostic information that could be used in research as well as in the clinical setting.
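As an illustration of the evaluation metric named in the Methods, the following minimal sketch computes a Spearman rank correlation between observed and predicted expression values. It is not the authors' code: the function names and the toy numbers are hypothetical, and the abstract does not say which software was used.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the window over all entries tied with the current value.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(observed, predicted):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(observed), average_ranks(predicted))


# Hypothetical toy values, NOT data from the study.
observed = [2.1, 3.4, 1.0, 5.2, 4.8]   # e.g. RNA-seq derived scores
predicted = [1.9, 3.0, 1.5, 4.1, 4.6]  # e.g. model outputs per slide
rho = spearman_rho(observed, predicted)
```

Because the metric depends only on ranks, any strictly increasing rescaling of the predictions leaves ρ unchanged, which suits a setting where the ordering of patients matters more than the absolute scale of the predicted expression.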

Keywords