IEEE Access (Jan 2025)
SLG-Net: Small-Large-Global Feature-Based Multilevel Feature Extraction Network for Ultrasound Image Segmentation
Abstract
Automatic ultrasound image segmentation improves the efficiency of clinical diagnosis and reduces the workload of doctors. Many ultrasound image segmentation methods focus only on capturing local details and global dependencies while ignoring large-scale context information. However, extracting large-scale context features is essential for large targets in images. To enhance the model's feature extraction capability for targets of various sizes and to improve segmentation performance, we propose an effective multilevel feature extraction network (SLG-Net) that extracts features ranging from small local details through large-scale context to global dependencies. SLG-Net is a parallel dual-encoder architecture consisting of a CNN encoder and a transformer encoder. Specifically, the CNN encoder uses large-small kernel attention (LSKA) modules to improve the representation and interaction of fine features and large-scale context features for targets of different sizes. The LSKA module first extracts features with a small kernel module and a large-scale feature selection (LSFS) module in parallel. The features extracted by these two modules are added and then passed to a multi-scale feature interaction module for further information interaction. To fully leverage the feature extraction capability of large kernel convolutions while reducing the number of parameters, we design a large kernel decomposition module (LKDM) to extract large-scale context features in the LSFS module. The transformer encoder captures global features to compensate for the limitations of the CNN encoder. To merge multilevel features, a multi-scale feature fusion module is introduced after the dual encoder. In addition, a multi-scale attention module is integrated at the skip connection to retain significant shallow features for the subsequent fusion of deep and shallow features.
Experiments on three public ultrasound datasets indicate that the proposed network achieves strong performance in ultrasound image segmentation, showing the potential of this study to advance intelligent clinical medicine.
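To illustrate why decomposing a large kernel can reduce parameters, the sketch below compares the weight count of a single depthwise K x K convolution with that of a depthwise 1 x K convolution followed by a depthwise K x 1 convolution. This is only an assumption for illustration: the abstract does not state how the LKDM decomposes its kernels, and the function names, the channel count, and the kernel size are hypothetical.

```python
# Illustrative sketch only: the abstract does not specify the LKDM design.
# ASSUMPTION: the large depthwise K x K kernel is replaced by a depthwise
# 1 x K convolution followed by a depthwise K x 1 convolution (a common
# "strip" decomposition). All names and sizes here are hypothetical.

def depthwise_params(channels: int, kh: int, kw: int) -> int:
    """Weights in a bias-free depthwise conv: one kh x kw filter per channel."""
    return channels * kh * kw


def full_kernel_params(channels: int, k: int) -> int:
    """Weights in a single depthwise k x k convolution."""
    return depthwise_params(channels, k, k)


def strip_decomposed_params(channels: int, k: int) -> int:
    """Weights in a depthwise 1 x k conv followed by a depthwise k x 1 conv."""
    return depthwise_params(channels, 1, k) + depthwise_params(channels, k, 1)


def receptive_field(kernel_sizes: list[int]) -> int:
    """Receptive field along one axis of stacked stride-1, undilated convs."""
    rf = 1
    for k in kernel_sizes:
        rf += k - 1
    return rf


if __name__ == "__main__":
    channels, k = 64, 21  # hypothetical values, not taken from the paper
    print("full K x K weights:      ", full_kernel_params(channels, k))
    print("strip-decomposed weights:", strip_decomposed_params(channels, k))
    # Along the width, the stack sees kernels of size k then 1;
    # along the height, 1 then k. Both axes keep a span of k.
    print("receptive field (width): ", receptive_field([k, 1]))
    print("receptive field (height):", receptive_field([1, k]))
```

Under this assumed decomposition, the per-channel weight count falls from K^2 to 2K while the receptive field along each axis remains K. The decomposed pair can only represent separable (rank-1) filters, so it approximates a full K x K kernel rather than being equivalent to it.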
Keywords