Frontiers in Medicine (Jul 2023)

CNN stability training improves robustness to scanner and IHC-based image variability for epithelium segmentation in cervical histology

  • Felipe Miranda Ruiz,
  • Bernd Lahrmann,
  • Liam Bartels,
  • Alexandra Krauthoff,
  • Andreas Keil,
  • Steffen Härtel,
  • Amy S. Tao,
  • Philipp Ströbel,
  • Megan A. Clarke,
  • Nicolas Wentzensen,
  • Niels Grabe

DOI
https://doi.org/10.3389/fmed.2023.1173616
Journal volume & issue
Vol. 10

Abstract

Background: In digital pathology, image properties such as color, brightness, contrast, and blurriness may vary with the scanner and sample preparation. Convolutional Neural Networks (CNNs) are sensitive to these variations and may underperform on images from a domain different from the one used for training. Robustness to these image property variations is required to enable the use of deep learning in clinical practice and large-scale clinical research.

Aims: CNN Stability Training (CST) is proposed and evaluated as a method to increase CNN robustness to scanner- and Immunohistochemistry (IHC)-based image variability.

Methods: CST was applied to segment epithelium in immunohistological cervical Whole Slide Images (WSIs). CST randomly distorts input tiles and factors the difference between the CNN predictions for the original and distorted inputs into the loss function. CNNs were trained using 114 p16-stained WSIs from the same scanner and evaluated on 6 WSI test sets, each with 23 to 24 WSIs of the same tissue but different scanner/IHC combinations. Relative robustness (rAUC) was measured as the difference between the AUC on the training-domain test set (i.e., the baseline test set) and the AUC on each of the remaining test sets.

Results: Across all test sets, the AUC of CST models outperformed "No CST" models (AUC: 0.940–0.989 vs. 0.905–0.986, p < 1e-8), and CST models obtained improved robustness (rAUC: [−0.038, −0.003] vs. [−0.081, −0.002]). At the WSI level, CST models showed an increase in performance in 124 of the 142 WSIs. CST models also outperformed models trained with random on-the-fly data augmentation (DA) in all test sets ([0.002, 0.021], p < 1e-6).

Conclusion: CST offers a path to improve CNN performance without the need for more data and allows customizing distortions to specific use cases. A Python implementation of CST is publicly available at https://github.com/TIGACenter/CST_v1.
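The stability term described in the Methods can be made concrete with a small sketch. The snippet below is an illustrative PyTorch implementation of a stability-training loss for segmentation; it is not the authors' released code (available at the repository linked above). The distortion function, the per-pixel cross-entropy task loss, the KL-divergence stability term, and the weighting factor `alpha` are assumptions chosen for clarity.

```python
import torch
import torch.nn.functional as F


def cst_loss(model, x, y_true, distort_fn, alpha=1.0):
    """Illustrative CNN Stability Training (CST) loss for segmentation.

    Assumptions (not the published implementation): `model` maps a batch of
    tiles (N, C, H, W) to per-pixel class logits (N, num_classes, H, W),
    `y_true` holds per-pixel class indices (N, H, W), and `distort_fn`
    applies a random distortion such as color, contrast, or blur jitter.
    """
    x_distorted = distort_fn(x)

    logits = model(x)                       # prediction on the original tile
    logits_distorted = model(x_distorted)   # prediction on the distorted tile

    # Standard task loss, computed on the undistorted input only.
    task_loss = F.cross_entropy(logits, y_true)

    # Stability term: penalize divergence between the two predictive
    # distributions so the model produces similar outputs for both versions.
    log_p_distorted = F.log_softmax(logits_distorted, dim=1)
    p_original = F.softmax(logits, dim=1)
    stability_loss = F.kl_div(log_p_distorted, p_original,
                              reduction="batchmean")

    return task_loss + alpha * stability_loss
```

In a training loop, `distort_fn` would typically wrap the same random color, brightness, contrast, or blur perturbations used to simulate scanner and IHC variability, and `alpha` controls how strongly prediction consistency is enforced relative to the segmentation loss.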

Keywords