Dual-attention transformer-based hybrid network for multi-modal medical image segmentation

Menghui Zhang; Yuchen Zhang; Shuaibing Liu; Yahui Han; Honggang Cao; Bingbing Qiao

doi:10.1038/s41598-024-76234-y

Scientific Reports (Oct 2024)

Affiliations

Menghui Zhang: Department of Hepatobiliary and Pancreatic Surgery, The First Affiliated Hospital of Zhengzhou University
Yuchen Zhang: Department of Hepatobiliary and Pancreatic Surgery, The First Affiliated Hospital of Zhengzhou University
Shuaibing Liu: Department of Hepatobiliary and Pancreatic Surgery, The First Affiliated Hospital of Zhengzhou University
Yahui Han: Department of Pediatric Surgery, The First Affiliated Hospital of Zhengzhou University
Honggang Cao: Department of Hepatobiliary and Pancreatic Surgery, The First Affiliated Hospital of Zhengzhou University
Bingbing Qiao: Department of Hepatobiliary and Pancreatic Surgery, The First Affiliated Hospital of Zhengzhou University

Journal volume & issue: Vol. 14, no. 1
pp. 1 – 22

Abstract

Accurate medical image segmentation plays a vital role in clinical practice. Convolutional Neural Networks and Transformers are the mainstream architectures for this task. However, convolutional neural networks lack the ability to model global dependencies, while Transformers are limited in extracting local details. In this paper, we propose DATTNet (Dual ATTention Network), an encoder-decoder deep learning model for medical image segmentation. DATTNet is built in a hierarchical fashion with two novel components: (1) a Dual Attention module, designed to model global dependencies in the spatial and channel dimensions, and (2) a Context Fusion Bridge, which remixes feature maps at multiple scales and constructs the correlations among them. Experiments on the ACDC, Synapse and Kvasir-SEG datasets are conducted to evaluate the performance of DATTNet. The proposed model shows superior performance, effectiveness and robustness compared to state-of-the-art (SOTA) methods, with mean Dice Similarity Coefficient scores of 92.2%, 84.5% and 89.1% on the cardiac, abdominal organ and gastrointestinal polyp segmentation tasks, respectively. The quantitative and qualitative results demonstrate that DATTNet performs favorably across different modalities (MRI, CT, and endoscopy) and generalizes to various tasks. It therefore shows potential for practical clinical applications. The code has been released at https://github.com/MhZhang123/DATTNet/tree/main.
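
The results above are reported as mean Dice Similarity Coefficient scores. For readers unfamiliar with the metric, the following is a minimal sketch of how a per-class Dice score and its mean over classes are commonly computed. It is not taken from the DATTNet repository: the function names, the flattened-list input format, and the smoothing term eps (including the convention that two empty masks score 1.0) are assumptions made for illustration only.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], target: Sequence[int], eps: float = 1e-7) -> float:
    """Dice = 2|P ∩ T| / (|P| + |T|) for two flattened binary (0/1) masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # For 0/1 masks, the element-wise product counts pixels set in both masks.
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # eps is an assumed smoothing term; it keeps the ratio defined
    # (and equal to 1.0) when both masks are empty.
    return (2.0 * intersection + eps) / (total + eps)


def mean_dice(pred_labels: Sequence[int], target_labels: Sequence[int],
              classes: Sequence[int]) -> float:
    """Average the per-class Dice scores over the given foreground classes."""
    scores = []
    for c in classes:
        # Turn the label maps into one binary mask per class.
        pred_mask = [1 if p == c else 0 for p in pred_labels]
        target_mask = [1 if t == c else 0 for t in target_labels]
        scores.append(dice_coefficient(pred_mask, target_mask))
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy flattened label maps with background 0 and foreground classes 1 and 2.
    prediction = [0, 1, 1, 2, 2, 0]
    ground_truth = [0, 1, 0, 2, 2, 2]
    print(f"mean Dice: {mean_dice(prediction, ground_truth, classes=[1, 2]):.4f}")
```

In this toy example class 1 overlaps on one pixel out of three labeled pixels in total (Dice 2/3) and class 2 on two out of five (Dice 4/5), so the mean is roughly 0.73. The background class is excluded from the average here, which is a common but not universal choice; the abstract does not state which convention the paper follows.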

Published in Scientific Reports

ISSN: 2045-2322 (Online)
Publisher: Nature Portfolio
Country of publisher: United Kingdom
LCC subjects: Medicine; Science
Website: https://www.nature.com/srep/
