Scientific Reports (Feb 2024)

Cluster-based histopathology phenotype representation learning by self-supervised multi-class-token hierarchical ViT

  • Jiarong Ye,
  • Shivam Kalra,
  • Mohammad Saleh Miri

DOI
https://doi.org/10.1038/s41598-024-53361-0
Journal volume & issue
Vol. 14, no. 1
pp. 1 – 10

Abstract

Developing a clinical AI model necessitates a large, highly curated dataset carefully annotated by multiple medical experts, which increases development time and cost. Self-supervised learning (SSL) enables AI models to leverage unlabelled data to acquire domain-specific background knowledge that can enhance their performance on various downstream tasks. In this work, we introduce CypherViT, a model for cluster-based histopathology phenotype representation learning using a self-supervised multi-class-token hierarchical Vision Transformer (ViT). CypherViT is a novel backbone that can be integrated into an SSL pipeline; it accommodates both coarse- and fine-grained feature learning for histopathological images via a hierarchical feature agglomerative attention module with multiple classification (cls) tokens in the ViT. Our qualitative analysis shows that the approach learns semantically meaningful regions of interest that align with morphological phenotypes. To validate the model, we use the DINO self-supervised learning framework to train CypherViT on a substantial dataset of unlabelled breast cancer histopathological images. The trained model proves to be a generalizable and robust feature extractor for colorectal cancer images. Notably, our model demonstrates promising performance in patch-level tissue phenotyping tasks across four public datasets. The results of our quantitative experiments highlight significant advantages over existing state-of-the-art SSL models and traditional transfer learning methods, such as those relying on ImageNet pre-training.
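
The abstract describes a ViT backbone that carries multiple classification (cls) tokens rather than one. As a rough illustration of that idea only (not the authors' code, and without the hierarchical feature agglomerative attention module), the PyTorch sketch below extends a plain ViT encoder with several cls tokens whose outputs can serve as coarse-to-fine patch-level descriptors; the class and parameter names here are hypothetical.

```python
# Minimal sketch of a ViT encoder with multiple cls tokens (illustrative only;
# names such as `num_cls_tokens` are assumptions, not from the paper).
import torch
import torch.nn as nn

class MultiClsTokenViT(nn.Module):
    def __init__(self, img_size=224, patch_size=16, dim=384,
                 depth=6, heads=6, num_cls_tokens=4):
        super().__init__()
        num_patches = (img_size // patch_size) ** 2
        # Non-overlapping patches projected to the embedding dimension
        self.patch_embed = nn.Conv2d(3, dim, kernel_size=patch_size, stride=patch_size)
        # Several learnable cls tokens instead of one; each can summarize
        # patch features at a different granularity.
        self.cls_tokens = nn.Parameter(torch.zeros(1, num_cls_tokens, dim))
        self.pos_embed = nn.Parameter(torch.zeros(1, num_patches + num_cls_tokens, dim))
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, dim_feedforward=dim * 4,
            batch_first=True, norm_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=depth)
        self.norm = nn.LayerNorm(dim)
        self.num_cls_tokens = num_cls_tokens

    def forward(self, x):
        B = x.shape[0]
        patches = self.patch_embed(x).flatten(2).transpose(1, 2)  # (B, N, dim)
        cls = self.cls_tokens.expand(B, -1, -1)                   # (B, K, dim)
        tokens = torch.cat([cls, patches], dim=1) + self.pos_embed
        tokens = self.norm(self.encoder(tokens))
        # Return the K cls-token embeddings as multi-granularity descriptors
        return tokens[:, :self.num_cls_tokens]

# Usage: extract embeddings for a batch of histopathology patches
if __name__ == "__main__":
    model = MultiClsTokenViT()
    feats = model(torch.randn(2, 3, 224, 224))
    print(feats.shape)  # torch.Size([2, 4, 384])
```

In an SSL setup such as DINO, these cls-token embeddings would be the student/teacher outputs matched across augmented views; the sketch omits that training loop and the clustering component.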