Scientific Reports (Oct 2024)
A lighter hybrid feature fusion framework for polyp segmentation
Abstract
Colonoscopy is widely recognized as the most effective method for detecting colon polyps, which is crucial for early screening of colorectal cancer. Polyp identification and segmentation in colonoscopy images require specialized medical knowledge and are often labor-intensive and expensive. Deep learning offers an intelligent and efficient approach to polyp segmentation. However, the variability in polyp size and the heterogeneity of polyp boundaries and interiors make accurate segmentation challenging. Transformer-based methods have become the mainstream approach to polyp segmentation, yet they tend to overlook local details owing to the inherent characteristics of the Transformer, leading to inferior results; moreover, the computational burden of self-attention hinders the practical application of these models. To address these issues, we propose a novel CNN-Transformer hybrid model for polyp segmentation (CTHP). CTHP combines the strengths of the CNN, which excels at modeling local information, and the Transformer, which excels at modeling global semantics, to enhance segmentation accuracy. We factor the self-attention computation over the entire feature map into the width and height directions, significantly improving computational efficiency. Additionally, we design a new information propagation module and introduce additional positional bias coefficients into the attention computation, which reduces the dispersal of information caused by deep and mixed feature fusion in the Transformer. Extensive experiments demonstrate that the proposed model achieves state-of-the-art performance on multiple benchmark datasets for polyp segmentation. Furthermore, cross-domain generalization experiments show that our model exhibits excellent generalization performance.
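The axial factorization described in the abstract, attending along the height and width directions instead of over the full H×W feature map, can be sketched as below. This is a minimal illustrative example, not the paper's implementation: the function names, tensor layout, and the way the additive positional bias terms (`pos_bias_h`, `pos_bias_w`) enter the attention scores are assumptions. Full 2-D self-attention scales as O((HW)²·C), while the two axial passes scale as O((H+W)·HW·C).

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def axial_attention(x, pos_bias_h=None, pos_bias_w=None):
    """Self-attention factored into a height pass and a width pass.

    x: feature map of shape (H, W, C). For simplicity, queries, keys,
    and values all equal x (no learned projections). pos_bias_h (H, H)
    and pos_bias_w (W, W) are hypothetical additive positional bias
    coefficients, standing in for the ones the paper introduces.
    """
    H, W, C = x.shape
    # Height pass: within each column, every position attends over H rows.
    scores_h = np.einsum('hwc,gwc->whg', x, x) / np.sqrt(C)  # (W, H, H)
    if pos_bias_h is not None:
        scores_h = scores_h + pos_bias_h  # broadcast over columns
    x = np.einsum('whg,gwc->hwc', softmax(scores_h), x)
    # Width pass: within each row, every position attends over W columns.
    scores_w = np.einsum('hwc,hvc->hwv', x, x) / np.sqrt(C)  # (H, W, W)
    if pos_bias_w is not None:
        scores_w = scores_w + pos_bias_w  # broadcast over rows
    x = np.einsum('hwv,hvc->hwc', softmax(scores_w), x)
    return x

feat = np.random.rand(8, 8, 16)
out = axial_attention(feat,
                      pos_bias_h=np.zeros((8, 8)),
                      pos_bias_w=np.zeros((8, 8)))
print(out.shape)  # (8, 8, 16)
```

Each pass mixes information along one spatial axis only, so two consecutive passes give every position an (indirect) global receptive field at a fraction of the cost of full self-attention.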
Keywords