Applied Sciences (Jan 2024)

Unraveling a Histopathological Needle-in-Haystack Problem: Exploring the Challenges of Detecting Tumor Budding in Colorectal Carcinoma Histology

  • Daniel Rusche
  • Nils Englert
  • Marlen Runz
  • Svetlana Hetjens
  • Cord Langner
  • Timo Gaiser
  • Cleo-Aron Weis

DOI
https://doi.org/10.3390/app14020949
Journal volume & issue
Vol. 14, no. 2
p. 949

Abstract

Background: In colorectal carcinoma (CRC), predicting post-surgery treatment needs requires identifying crucial tumor features within whole slide images of solid tumors, a task analogous to locating a needle in a histological haystack. We evaluate two approaches to this challenge using a small CRC dataset. Methods: First, we explore a conventional tile-level training approach, testing various data augmentation methods to mitigate the memorization effect in a noisy-label setting. Second, we examine a multi-instance learning (MIL) approach at the case level, adapting data augmentation techniques to prevent overfitting on the limited dataset. Results: The tile-level approach proves ineffective because each case contains only a limited number of informative image tiles. Conversely, the MIL approach succeeds on the small dataset when coupled with data augmentation applied after feature vector creation. In this setting, the MIL model accurately predicts nodal status corresponding to expert-based budding scores for these cases. Conclusions: This study incorporates data augmentation techniques into a MIL approach and highlights the effectiveness of MIL in detecting predictive factors such as tumor budding, despite the constraints of a limited dataset.
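
The idea of augmenting a case after its tiles have been turned into feature vectors, and then pooling over the whole case, can be illustrated with a small sketch. The abstract does not state which augmentations or which pooling operator were used, so everything below is an assumption made for illustration: the function names `augment_bag` and `max_pool_score`, the parameters `keep_fraction` and `noise_std`, the choice of random tile subsampling plus Gaussian jitter, and the use of max pooling are all hypothetical stand-ins rather than the authors' method.

```python
import random


def augment_bag(bag, rng, keep_fraction=0.8, noise_std=0.05):
    """Return an augmented copy of a bag of tile feature vectors.

    Assumed augmentations (not taken from the paper):
    1. keep a random subset of the tiles,
    2. add small Gaussian noise to every feature value.
    """
    n_keep = max(1, round(len(bag) * keep_fraction))
    kept = rng.sample(bag, n_keep)
    return [[x + rng.gauss(0.0, noise_std) for x in vec] for vec in kept]


def max_pool_score(bag, weights, bias=0.0):
    """Score each tile with a linear model and return the highest score.

    Max pooling is one simple MIL aggregator: a single informative tile
    is enough to drive the case-level score.
    """
    return max(sum(w * x for w, x in zip(weights, vec)) + bias for vec in bag)


rng = random.Random(0)

# Toy case: nine uninformative tiles and one informative "needle" tile.
bag = [[0.0, 0.0] for _ in range(9)] + [[1.0, 1.0]]
weights = [1.0, 1.0]  # hypothetical, untrained weights

augmented = augment_bag(bag, rng)
case_score = max_pool_score(bag, weights)
augmented_score = max_pool_score(augmented, weights)
```

Because the case-level score depends on the most informative tile, the toy example also shows the needle-in-haystack risk: if subsampling happens to drop the single informative tile, the augmented bag loses its signal, which is why the fraction of tiles kept would need to be chosen with care in such a scheme.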

Keywords