Computational and Structural Biotechnology Journal (Jan 2021)

HistoClean: Open-source software for histological image pre-processing and augmentation to improve development of robust convolutional neural networks

  • Kris D. McCombe,
  • Stephanie G. Craig,
  • Amélie Viratham Pulsawatdi,
  • Javier I. Quezada-Marín,
  • Matthew Hagan,
  • Simon Rajendran,
  • Matthew P. Humphries,
  • Victoria Bingham,
  • Manuel Salto-Tellez,
  • Richard Gault,
  • Jacqueline A. James

Journal volume & issue
Vol. 19
pp. 4840 – 4853

Abstract

The growth of digital pathology over the past decade has opened new research pathways and insights in cancer prediction and prognosis. In particular, there has been a surge in deep learning and computer vision techniques to analyse digital images. Common practice in this area is to use image pre-processing and augmentation to prevent bias and overfitting, creating a more robust deep learning model. This generally requires consultation of documentation for multiple coding libraries, as well as trial and error to ensure that the techniques used on the images are appropriate. Herein we introduce HistoClean, a user-friendly graphical user interface that brings together multiple image processing modules into one easy-to-use toolkit. HistoClean is an application that aims to help bridge the knowledge gap between pathologists, biomedical scientists and computer scientists by providing transparent image augmentation and pre-processing techniques which can be applied without prior coding knowledge. In this study, we utilise HistoClean to pre-process images for a simple convolutional neural network used to detect stromal maturity, improving the accuracy of the model at a tile, region of interest, and patient level. This study demonstrates how HistoClean can be used to improve a standard deep learning workflow via classical image augmentation and pre-processing techniques, even with a relatively simple convolutional neural network architecture. HistoClean is free and open-source and can be downloaded from the GitHub repository here: https://github.com/HistoCleanQUB/HistoClean.

Keywords