PeerJ Computer Science (Mar 2024)
SUTrans-NET: a hybrid transformer approach to skin lesion segmentation
Abstract
Melanoma is a malignant skin tumor that threatens human life and health. Early detection is essential for effective treatment. However, the low contrast between melanoma lesions and normal skin, together with irregular lesion sizes and shapes, makes early-stage lesions difficult to detect with the naked eye and renders skin lesion segmentation a challenging task. Traditional U-shaped encoder-decoder networks built with convolutional neural networks (CNNs) have limitations in establishing long-range dependencies and global contextual connections, while the Transformer architecture is limited in its applicability to small medical datasets. To address these issues, we propose a new skin lesion segmentation network, SUTrans-NET, which combines a CNN and a Transformer in parallel to form a dual encoder, in which the CNN and Transformer branches dynamically and interactively fuse image information at each layer. At the same time, we introduce a multi-group attention module, SpatialGroupAttention (SGA), to complement the spatial and texture information of the Transformer branch, and we adopt the Focus idea from YOLOv5 to construct the Patch Embedding module of the Transformer to prevent loss of pixel-level accuracy. In addition, we design a decoder with full-scale information fusion capability to fully fuse the shallow and deep features from different stages of the encoder. The effectiveness of our method is demonstrated on the ISIC 2016, ISIC 2017, ISIC 2018 and PH2 datasets, and its advantages over existing methods are verified.
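As a point of reference for the Focus-based Patch Embedding mentioned above, the sketch below shows how a YOLOv5-style Focus slicing step can feed a Transformer patch projection. It is a minimal illustration, not the paper's released code; the module name `FocusPatchEmbed` and the parameter choices (embedding dimension, patch size) are assumptions made for this example.

```python
import torch
import torch.nn as nn

class FocusPatchEmbed(nn.Module):
    """Hypothetical Focus-style patch embedding sketch.

    The input is sliced into four pixel-interleaved sub-images that are
    concatenated along the channel axis, so no pixel is discarded before the
    convolutional projection to the Transformer embedding dimension.
    """
    def __init__(self, in_chans=3, embed_dim=96, patch_size=4):
        super().__init__()
        # Slicing halves the spatial size and quadruples the channels,
        # so the projection only needs the remaining stride patch_size // 2.
        self.proj = nn.Conv2d(in_chans * 4, embed_dim,
                              kernel_size=patch_size // 2,
                              stride=patch_size // 2)
        self.norm = nn.LayerNorm(embed_dim)

    def forward(self, x):                       # x: (B, C, H, W)
        x = torch.cat([x[..., ::2, ::2],        # even rows, even cols
                       x[..., 1::2, ::2],       # odd rows,  even cols
                       x[..., ::2, 1::2],       # even rows, odd cols
                       x[..., 1::2, 1::2]],     # odd rows,  odd cols
                      dim=1)                    # (B, 4C, H/2, W/2)
        x = self.proj(x)                        # (B, embed_dim, H/p, W/p)
        _, _, hp, wp = x.shape
        tokens = x.flatten(2).transpose(1, 2)   # (B, hp*wp, embed_dim)
        return self.norm(tokens), (hp, wp)

# Example usage: a 224x224 RGB image becomes a 56x56 grid of 96-dim tokens.
tokens, grid = FocusPatchEmbed()(torch.randn(1, 3, 224, 224))
print(tokens.shape, grid)  # torch.Size([1, 3136, 96]) (56, 56)
```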
Keywords