IEEE Access (Jan 2020)

Detection of Breast Cancer From Whole Slide Histopathological Images Using Deep Multiple Instance CNN

  • Kausik Das,
  • Sailesh Conjeti,
  • Jyotirmoy Chatterjee,
  • Debdoot Sheet

DOI
https://doi.org/10.1109/ACCESS.2020.3040106
Journal volume & issue
Vol. 8
pp. 213502 – 213511

Abstract

Histopathological Whole Slide Imaging (WSI) has become a standard in the detection of breast cancer. Automated image analysis methods attempt to reduce the workload of clinicians, and Convolutional Neural Networks (CNNs) are a popular choice for this purpose. However, a WSI is typically about $40,000\times 40,000$ pixels in size (and can reach up to $100,000\times 100,000$ pixels), which is too large for a CNN to process directly. Moreover, downscaling a WSI degrades small-scale visual information. Hence, a large number of small patches containing critical visual information are extracted from each WSI by a trained pathologist and used for training. Precisely searching for and labeling appropriate class-representative patches, however, requires a massive amount of time. To address this issue, this paper introduces a Deep Multiple Instance Learning (MIL) based CNN framework. In the proposed framework, every slide is represented as a bag of extracted patches. Only the bag label is used for training, which eliminates the need for patch-wise labels. The patches inherit the label of the bag containing them. A WSI (i.e., a bag) is labeled benign if all of its patches are benign, and malignant if even a single patch contains malignant cells. Learning can therefore be carried out at the bag level even with noisy patch labels. The method was evaluated on the BreakHis, IUPHL and UCSB breast cancer datasets, where it achieved accuracies of 93.06%, 96.63% and 95.83%, respectively.
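
The bag-labeling rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `bag_label`, `inherit_labels` and `bag_score` are hypothetical, and the use of max pooling to turn patch-level probabilities into a bag-level score is an assumption (a common MIL choice), since the abstract does not state which aggregation the paper uses.

```python
from typing import List, Sequence

BENIGN, MALIGNANT = 0, 1


def bag_label(patch_labels: Sequence[int]) -> int:
    """Standard MIL rule: a bag (slide) is malignant if any patch is
    malignant, and benign only if every patch is benign."""
    return MALIGNANT if any(p == MALIGNANT for p in patch_labels) else BENIGN


def inherit_labels(label: int, n_patches: int) -> List[int]:
    """Each patch inherits the label of its bag. For a malignant bag
    these labels are noisy, because some patches may actually be benign."""
    return [label] * n_patches


def bag_score(patch_probs: Sequence[float]) -> float:
    """Assumed aggregation (max pooling): the bag-level malignancy score
    is the highest patch-level malignancy probability."""
    if not patch_probs:
        raise ValueError("a bag must contain at least one patch")
    return max(patch_probs)


if __name__ == "__main__":
    print(bag_label([BENIGN, BENIGN, BENIGN]))     # all patches benign
    print(bag_label([BENIGN, MALIGNANT, BENIGN]))  # one malignant patch
    print(inherit_labels(MALIGNANT, 3))
    print(bag_score([0.1, 0.8, 0.3]))
```

Under max pooling, a single confidently malignant patch is enough to drive the bag score up, which mirrors the rule that one malignant patch makes the whole slide malignant.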

Keywords