IEEE Access (Jan 2024)

Multi-TranResUnet: An Improved Transformer Network for Solving Multi-Scale Issues in Image Segmentation

  • Yajing Kang,
  • Shuai Cheng,
  • Liang Guo,
  • Chao Zheng,
  • Jizhuang Zhao

DOI
https://doi.org/10.1109/ACCESS.2024.3457823
Journal volume & issue
Vol. 12
pp. 129000 – 129011

Abstract

Deep-learning-driven medical image segmentation marks a significant milestone in the evolution of intelligent healthcare systems. Despite remarkable accuracy achievements, real-world clinical applications still grapple with complex challenges, particularly in handling multi-scale medical targets. This paper introduces a novel and efficient medical image segmentation network that leverages Transformer technology. The proposed network exploits the Transformer's global feature extraction capabilities, enriched with spatial context, to substantially improve segmentation accuracy. In addition, we build a fusion encoder that combines Transformer modules and convolutional structures through feature fusion strategies, further strengthening feature extraction. Acknowledging the computational demands of Transformer models in practical scenarios, we optimize our Transformer architecture to reduce parameter count and inference latency, tailoring the model to the sample scarcity typical of medical applications. We evaluated our model on three medical datasets: the 2018 Lesion Boundary Segmentation Challenge, the 2018 Data Science Bowl Challenge, and the Kvasir-Instrument dataset. Our model achieves state-of-the-art performance in both Dice and MIoU metrics while maintaining robust real-time processing capabilities. Our code will be released at https://github.com/migouKang/Multi-TranResUnet.
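To make the idea of a "fusion encoder" concrete, the sketch below shows one plausible way to combine a convolutional branch (local detail) with a Transformer self-attention branch (global context) and fuse the two feature maps. This is purely an illustrative assumption based on the abstract, not the authors' released implementation; the class name `FusionEncoderBlock` and all hyperparameters are hypothetical.

```python
# Illustrative sketch only (not the Multi-TranResUnet code): a hypothetical
# fusion stage that merges CNN and Transformer features by concatenation + 1x1 conv.
import torch
import torch.nn as nn


class FusionEncoderBlock(nn.Module):
    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        # Convolutional branch: captures local spatial detail.
        self.conv_branch = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )
        # Transformer branch: global self-attention over flattened spatial tokens.
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        # Feature fusion: concatenate both branches, project back to `channels`.
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        local_feat = self.conv_branch(x)                    # (B, C, H, W)
        tokens = x.flatten(2).transpose(1, 2)               # (B, H*W, C)
        global_feat, _ = self.attn(tokens, tokens, tokens)  # global context
        global_feat = global_feat.transpose(1, 2).reshape(b, c, h, w)
        return self.fuse(torch.cat([local_feat, global_feat], dim=1))


if __name__ == "__main__":
    block = FusionEncoderBlock(channels=64)
    out = block(torch.randn(1, 64, 32, 32))
    print(out.shape)  # torch.Size([1, 64, 32, 32])
```

In such a design, the 1x1 fusion convolution keeps the channel count fixed so the block can be stacked at each encoder stage; the actual fusion strategy in the paper may differ.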

Keywords