Frontiers in Oncology (Oct 2022)

MC-ViT: Multi-path cross-scale vision transformer for thymoma histopathology whole slide image typing

  • Huaqi Zhang,
  • Huang Chen,
  • Jin Qin,
  • Bei Wang,
  • Guolin Ma,
  • Pengyu Wang,
  • Dingrong Zhong,
  • Jie Liu

DOI
https://doi.org/10.3389/fonc.2022.925903
Journal volume & issue
Vol. 12

Abstract

Objectives: Accurate histological typing plays an important role in diagnosing thymoma or thymic carcinoma (TC) and predicting the corresponding prognosis. In this paper, we develop and validate a deep learning-based thymoma typing method for hematoxylin & eosin (H&E)-stained whole slide images (WSIs), which provides useful histopathology information from patients to assist doctors in better diagnosing thymoma or TC.

Methods: We propose a multi-path cross-scale vision transformer (MC-ViT), which first uses the cross attentive scale-aware transformer (CAST) to classify the pathological information related to thymoma, and then uses such pathological information priors to assist the WSIs transformer (WT) in thymoma typing. To make full use of the multi-scale (10×, 20×, and 40×) information inherent in a WSI, CAST not only employs parallel multi-path branches to capture different receptive-field features from multi-scale WSI inputs, but also introduces the cross-correlation attention module (CAM) to aggregate multi-scale features and achieve cross-scale spatial information complementarity. After that, WT can effectively convert full-scale WSIs into 1D feature matrices with pathological information labels, improving the efficiency and accuracy of thymoma typing.

Results: We construct a large-scale thymoma histopathology WSI (THW) dataset and annotate the corresponding pathological information and thymoma typing labels. The proposed MC-ViT achieves Top-1 accuracies of 0.939 and 0.951 in pathological information classification and thymoma typing, respectively. Moreover, quantitative and statistical experiments on the THW dataset demonstrate that our pipeline performs favorably against existing classical convolutional neural networks, vision transformers, and deep learning-based medical image classification methods.

Conclusion: This paper demonstrates that comprehensively utilizing the pathological information contained in multi-scale WSIs is feasible for thymoma typing and achieves clinically acceptable performance. Specifically, the proposed MC-ViT can accurately predict pathological information classes as well as thymoma types, showing application potential for the diagnosis of thymoma and TC and potentially assisting doctors in improving diagnostic efficiency and accuracy.
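For readers who want a concrete picture of the multi-path cross-scale design described in the Methods, the snippet below is a minimal illustrative PyTorch sketch, not the authors' implementation: three parallel paths encode crops at 10×, 20×, and 40×, and cross-attention lets one scale's tokens attend to another scale's tokens before classification. All module names, dimensions, the patch-embedding stem, and the pairwise fusion order are assumptions made only for illustration.

```python
# Minimal sketch (assumed design, not the paper's code) of a multi-path,
# cross-scale attention classifier for multi-magnification histopathology crops.
import torch
import torch.nn as nn


class PathEncoder(nn.Module):
    """Encodes one magnification path into a token sequence (hypothetical ViT-style stem)."""

    def __init__(self, in_ch=3, dim=256, patch=16):
        super().__init__()
        # Non-overlapping patch embedding.
        self.proj = nn.Conv2d(in_ch, dim, kernel_size=patch, stride=patch)

    def forward(self, x):                          # x: (B, 3, H, W)
        tokens = self.proj(x)                      # (B, dim, H/patch, W/patch)
        return tokens.flatten(2).transpose(1, 2)   # (B, N, dim)


class CrossScaleAttention(nn.Module):
    """One scale's tokens query another scale's tokens (a stand-in for the paper's CAM)."""

    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, query_tokens, context_tokens):
        fused, _ = self.attn(query_tokens, context_tokens, context_tokens)
        return self.norm(query_tokens + fused)     # residual fusion


class MultiPathCrossScaleClassifier(nn.Module):
    """Three parallel paths (10x/20x/40x crops), pairwise cross-scale fusion, then pooling."""

    def __init__(self, dim=256, num_classes=4):
        super().__init__()
        self.paths = nn.ModuleList([PathEncoder(dim=dim) for _ in range(3)])
        self.fuse_40_to_20 = CrossScaleAttention(dim)
        self.fuse_20_to_10 = CrossScaleAttention(dim)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, x10, x20, x40):
        t10, t20, t40 = (p(x) for p, x in zip(self.paths, (x10, x20, x40)))
        t20 = self.fuse_40_to_20(t20, t40)         # inject fine-scale (40x) context into 20x
        t10 = self.fuse_20_to_10(t10, t20)         # inject 20x context into 10x
        return self.head(t10.mean(dim=1))          # mean-pool tokens, predict class logits


if __name__ == "__main__":
    model = MultiPathCrossScaleClassifier()
    # Dummy crops of the same field of view at three magnifications.
    x10 = torch.randn(2, 3, 224, 224)
    x20 = torch.randn(2, 3, 224, 224)
    x40 = torch.randn(2, 3, 224, 224)
    print(model(x10, x20, x40).shape)              # torch.Size([2, 4])
```

The paper additionally feeds the predicted pathological information as priors into a second-stage WSI transformer (WT) for slide-level thymoma typing; that stage is omitted from this sketch for brevity.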

Keywords