IEEE Access (Jan 2024)
UNestFormer: Enhancing Decoders and Skip Connections With Nested Transformers for Medical Image Segmentation
Abstract
Precise identification of organs and lesions in medical images is essential for accurate disease diagnosis and analysis of organ structures. Deep convolutional neural network (CNN)-based U-shaped networks are among the most popular and promising approaches for this task. Recently, full-Transformer and hybrid CNN-Transformer architectures have gained traction in medical image segmentation due to their effectiveness. However, current approaches suffer from at least one of three limitations: 1) CNN-based models struggle to capture long-range dependencies; 2) Transformer-based models are limited in extracting local features, resulting in a loss of low-level detail; and 3) hybrid models are computationally complex. In this paper, we propose UNestFormer, a novel CNN-Transformer hybrid framework in which densely connected nested Transformers link the encoder and decoder, reducing the semantic gap between them and yielding more precise segmentation. The traditional Transformer, built on multi-head self-attention (MHSA), primarily captures global attention and overlooks other forms of attention. In contrast, we design an omni-attention mechanism that incorporates four forms of attention, namely local, global, channel, and spatial, and embed it in the omni-attention transformer block (OmniBlock). UNestFormer is designed to be comparatively lightweight yet robust and accurate. We argue that nested Transformers with the proposed OmniBlock serve as strong decoders with efficient skip connections for medical image segmentation, enhancing feature aggregation and minimizing information loss. Extensive experiments validate UNestFormer's superiority across benchmark medical datasets. It outperforms its closest competitors by 2.1% on Synapse and 1.29% on ISIC 2018 in terms of the Dice similarity coefficient (DSC), and achieves a 1.45% improvement in Hausdorff distance (HD) on Synapse, all while maintaining lower computational cost.
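To make the omni-attention idea concrete, the sketch below combines the four attention forms named in the abstract (channel, spatial, local windowed, and global) in a single module. It is a minimal illustration only: the class name `OmniAttentionSketch`, the window size, the head count, and the specific gating layers are our assumptions and do not reproduce the paper's actual OmniBlock design.

```python
# Illustrative sketch only: one way to combine channel, spatial, local (windowed),
# and global attention in a single block. Hypothetical structure, not the paper's OmniBlock.
import torch
import torch.nn as nn

class OmniAttentionSketch(nn.Module):
    def __init__(self, dim, heads=4, window=8):
        super().__init__()
        self.window = window
        # Channel attention: squeeze-and-excitation style per-channel gating.
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(dim, dim, 1), nn.Sigmoid())
        # Spatial attention: single-channel gating map over H x W.
        self.spatial_gate = nn.Sequential(nn.Conv2d(dim, 1, 7, padding=3), nn.Sigmoid())
        # Local and global token mixing via multi-head self-attention.
        self.local_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.global_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.proj = nn.Conv2d(dim, dim, 1)

    def forward(self, x):                       # x: (B, C, H, W)
        b, c, h, w = x.shape
        x = x * self.channel_gate(x)            # channel attention
        x = x * self.spatial_gate(x)            # spatial attention
        ws = self.window
        # Local attention inside non-overlapping ws x ws windows.
        win = x.reshape(b, c, h // ws, ws, w // ws, ws) \
               .permute(0, 2, 4, 3, 5, 1).reshape(-1, ws * ws, c)
        win, _ = self.local_attn(win, win, win)
        local = win.reshape(b, h // ws, w // ws, ws, ws, c) \
                   .permute(0, 5, 1, 3, 2, 4).reshape(b, c, h, w)
        # Global attention over all spatial tokens.
        tokens = x.flatten(2).transpose(1, 2)   # (B, H*W, C)
        glob, _ = self.global_attn(tokens, tokens, tokens)
        glob = glob.transpose(1, 2).reshape(b, c, h, w)
        return self.proj(local + glob)

# Example: a 64-channel feature map whose spatial size is divisible by the window.
feat = torch.randn(1, 64, 32, 32)
out = OmniAttentionSketch(64)(feat)
print(out.shape)  # torch.Size([1, 64, 32, 32])
```

Under these assumptions, the channel and spatial gates recalibrate features cheaply before the two self-attention branches, keeping the block comparatively lightweight while still mixing tokens both within local windows and across the full feature map.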
Keywords