Heliyon (Sep 2024)

Improving performance in colorectal cancer histology decomposition using deep and ensemble machine learning

  • Fabi Prezja,
  • Leevi Annala,
  • Sampsa Kiiskinen,
  • Suvi Lahtinen,
  • Timo Ojala,
  • Pekka Ruusuvuori,
  • Teijo Kuopio

Journal volume & issue
Vol. 10, no. 18
p. e37561

Abstract

In routine colorectal cancer management, histologic samples stained with hematoxylin and eosin are commonly used. Nonetheless, their potential for defining objective biomarkers for patient stratification and treatment selection is still being explored. The current gold standard relies on expensive and time-consuming genetic tests. However, recent research highlights the potential of convolutional neural networks (CNNs) to facilitate the extraction of clinically relevant biomarkers from these readily available images. These CNN-based biomarkers can predict patient outcomes comparably to the gold standard, with the added advantages of speed, automation, and minimal cost. The predictive potential of CNN-based biomarkers fundamentally relies on the ability of CNNs to accurately classify diverse tissue types in whole slide microscope images. Consequently, improving the accuracy of tissue class decomposition is critical to strengthening the prognostic potential of imaging-based biomarkers. This study introduces a hybrid deep transfer learning and ensemble machine learning model that improves upon previous approaches, including a transformer and neural architecture search baseline for this task. We paired the EfficientNetV2 architecture with a random forest classification head. Our model achieved 96.74% accuracy (95% CI: 96.3%-97.1%) on the external test set and 99.89% accuracy on the internal test set. Recognizing the potential of these models for this task, we have made them publicly available.
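
The abstract reports accuracy together with a 95% confidence interval but does not say how the interval was computed. As an illustration only, the sketch below uses the Wilson score interval, one common choice for a proportion such as classification accuracy. The function name `wilson_interval` and the example counts (967 correct out of 1000) are hypothetical and are not taken from the study.

```python
import math
from typing import Tuple


def wilson_interval(correct: int, total: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for an accuracy of correct / total.

    z = 1.96 corresponds to a two-sided 95% interval.
    This is an assumed method; the abstract does not name the one it used.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= correct <= total:
        raise ValueError("correct must lie between 0 and total")

    p = correct / total                      # observed accuracy
    denom = 1.0 + z * z / total              # shared normalising term
    centre = (p + z * z / (2 * total)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    )
    return centre - half_width, centre + half_width


# Hypothetical counts, chosen only to show the call; not the study's data.
low, high = wilson_interval(correct=967, total=1000)
print(f"accuracy = {967 / 1000:.4f}, 95% CI = [{low:.4f}, {high:.4f}]")
```

With the same observed accuracy, the interval narrows as the test set grows, which is why an interval is more informative when reported alongside the test set size.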

Keywords