Frontiers in Oncology (Apr 2022)

Breast Cancer Molecular Subtype Prediction on Pathological Images with Discriminative Patch Selection and Multi-Instance Learning

  • Hong Liu,
  • Wen-Dong Xu,
  • Zi-Hao Shang,
  • Xiang-Dong Wang,
  • Hai-Yan Zhou,
  • Ke-Wen Ma,
  • Huan Zhou,
  • Jia-Lin Qi,
  • Jia-Rui Jiang,
  • Li-Lan Tan,
  • Hui-Min Zeng,
  • Hui-Juan Cai,
  • Kuan-Song Wang,
  • Yue-Liang Qian

DOI
https://doi.org/10.3389/fonc.2022.858453
Journal volume & issue
Vol. 12

Abstract

Molecular subtypes of breast cancer are an important reference for personalized clinical treatment. To save cost and labor, usually only one of the patient’s paraffin blocks is selected for subsequent immunohistochemistry (IHC) to determine the molecular subtype. Because of tumor heterogeneity, block sampling error is inevitable; it is risky and could delay treatment. Predicting molecular subtypes from conventional H&E pathological whole slide images (WSIs) with AI methods could help pathologists pre-screen the proper paraffin block for IHC. The task is challenging because only WSI-level molecular subtype labels from IHC are available, without detailed local region information. Gigapixel WSIs must be divided into a very large number of patches to make deep learning computationally feasible, but with only coarse slide-level labels, patch-based methods may suffer from abundant noise patches, such as folds, overstained regions, or non-tumor tissue. A weakly supervised learning framework based on discriminative patch selection and multi-instance learning (MIL) was proposed for breast cancer molecular subtype prediction from H&E WSIs. First, a co-teaching strategy using two networks was adopted to learn molecular subtype representations and filter out some noise patches. Then, a balanced sampling strategy was used to handle the subtype imbalance in the dataset. In addition, a noise patch filtering algorithm that applies the local outlier factor based on cluster centers was proposed to further select discriminative patches. Finally, a loss function that integrates local patch information with a global slide constraint was used to fine-tune the MIL framework on the selected discriminative patches and further improve molecular subtyping performance. The experimental results confirmed the effectiveness of the proposed method, and our models outperformed even senior pathologists, which suggests the method has the potential to assist pathologists in pre-screening paraffin blocks for IHC in the clinic.
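
The co-teaching step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the commonly described small-loss rule for co-teaching: each of the two networks keeps the patches on which its own loss is lowest, and those patches are then used to update the other network. The abstract does not give the authors' exact selection rule, so the function names, the `forget_rate` parameter, and the toy loss values below are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def small_loss_indices(losses: Sequence[float], forget_rate: float) -> List[int]:
    """Return indices of the lowest-loss patches, keeping a (1 - forget_rate) fraction."""
    if not 0.0 <= forget_rate < 1.0:
        raise ValueError("forget_rate must be in [0, 1)")
    # Hypothetical rounding rule: keep at least one patch per batch.
    num_keep = max(1, int(round((1.0 - forget_rate) * len(losses))))
    order = sorted(range(len(losses)), key=lambda i: losses[i])
    return sorted(order[:num_keep])


def co_teaching_exchange(
    losses_a: Sequence[float],
    losses_b: Sequence[float],
    forget_rate: float,
) -> Tuple[List[int], List[int]]:
    """Return (patches to train network A on, patches to train network B on)."""
    picked_by_a = small_loss_indices(losses_a, forget_rate)
    picked_by_b = small_loss_indices(losses_b, forget_rate)
    # Cross-update: A learns from B's picks and B learns from A's picks,
    # so a noisy patch that one network has memorized can still be rejected.
    return picked_by_b, picked_by_a


# Toy per-patch losses for one mini-batch of four patches (illustrative values).
losses_net_a = [0.2, 1.9, 0.4, 0.1]
losses_net_b = [0.3, 0.5, 2.2, 0.2]
train_a, train_b = co_teaching_exchange(losses_net_a, losses_net_b, forget_rate=0.25)
print(train_a, train_b)
```

In this toy batch, network A assigns a high loss to patch 1 and network B to patch 2, so each network is updated without the patch its peer considers likely noise. In practice the losses would come from a forward pass of each network on the same mini-batch of patches that inherit the slide-level label.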

Keywords