Journal of King Saud University: Computer and Information Sciences (Sep 2024)
SWFormer: A scale-wise hybrid CNN-Transformer network for multi-classes weed segmentation
Abstract
Weeds in rapeseed fields are a major cause of crop yield reduction and economic loss. Precision agriculture is therefore important for sustainable farming and weed management. Deep learning techniques have shown great potential for image-based detection and classification of crops and weeds. However, the inherent limitations of traditional convolutional neural networks pose significant challenges, because weeds and crops are locally similar in color, shape, and texture. To address this issue, we introduce SWFormer, a scale-wise hybrid CNN-Transformer network that leverages the distinct strengths of both architectures: convolutional structures excel at extracting short-range dependencies among pixels, whereas transformer structures are adept at capturing global dependencies. We also propose two modules. First, the Scale-wise Cascade Convolution (SWCC) module captures multiscale features and expands the receptive field. Second, the Adaptive Semantic Aggregation (ASA) module adaptively fuses information from two distinct feature maps. Experiments on the publicly available CropAndWeed and SB20 datasets show that SWFormer outperforms other mainstream segmentation models. Specifically, SWFormer, with 52.33M parameters and 527.51 GFLOPs, achieves an mAP of 76.54% and an accuracy of 83.95% on the CropAndWeed dataset. On the SB20 dataset, it attains an mAP of 61.24% and an accuracy of 79.47%. Overall, the evaluation indicates that SWFormer can support further research in precision agriculture.
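The claim that cascading convolutions expands the receptive field can be illustrated with simple arithmetic. The sketch below is not the SWCC module itself: the abstract does not give kernel sizes, dilation rates, or the number of stages, so the 3x3 kernels and the dilations 1, 2, 3 are purely hypothetical, as is the helper name `receptive_field`. It only applies the standard rule that each stride-1 convolution grows the receptive field by (kernel_size - 1) * dilation.

```python
def receptive_field(layers):
    """Receptive field (in pixels, along one axis) of stacked stride-1 convolutions.

    layers: sequence of (kernel_size, dilation) pairs, applied in order.
    """
    rf = 1  # a single pixel sees only itself before any convolution
    for kernel_size, dilation in layers:
        # Each stride-1 convolution widens the field by (k - 1) * d.
        rf += (kernel_size - 1) * dilation
    return rf


# Hypothetical cascade (an assumption, not taken from the paper):
# three 3x3 convolutions with dilations 1, 2 and 3.
cascade = [(3, 1), (3, 2), (3, 3)]

# Receptive field after each stage of the cascade.
per_stage = [receptive_field(cascade[: i + 1]) for i in range(len(cascade))]
print(per_stage)  # [3, 7, 13]
```

Under these assumed settings, the outputs of successive stages see 3, 7, and 13 pixels respectively, so reading features from every stage of such a cascade yields several scales at once while the deepest stage covers the widest context.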