Agriculture (Dec 2024)
Cross-Modal Feature Fusion for Field Weed Mapping Using RGB and Near-Infrared Imagery
Abstract
Accurate mapping of weeds in agricultural fields is essential for effective weed control and improved crop productivity. To move beyond the limitations of RGB imagery alone, this study presents a cross-modal feature fusion network (CMFNet) that integrates RGB and near-infrared (NIR) imagery for precise weed mapping. CMFNet first applies color space enhancement and adaptive histogram equalization to improve the brightness and contrast of both the RGB and NIR images. Building on a Transformer-based segmentation framework, it then introduces a cross-modal multi-scale feature enhancement module whose spatial and channel feature interactions automatically capture complementary information across the two modalities. The enhanced features are fused and refined with an attention mechanism, which reduces background interference and improves segmentation accuracy. Extensive experiments on two public datasets, Sugar Beets 2016 and Sunflower, demonstrate that CMFNet significantly outperforms CNN-based segmentation models in weed and crop segmentation. On these two datasets, the model achieved Intersection over Union (IoU) scores of 90.86% and 90.77% and Mean Accuracy (mAcc) scores of 93.8% and 94.35%, respectively. Ablation studies further show that the proposed cross-modal fusion method provides substantial improvements over basic feature fusion methods, effectively localizing weed and crop regions across diverse field conditions. These findings underscore the potential of CMFNet as a robust solution for precise and adaptive weed mapping in complex agricultural landscapes.
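For readers unfamiliar with the two reported metrics, the following minimal sketch shows how per-class IoU and mean accuracy are commonly computed from a confusion matrix over pixel labels. It is an illustration only, not the authors' evaluation code: the label mapping (0 = background, 1 = crop, 2 = weed), the function names, and the choice to average over the classes present in the ground truth are all assumptions, and the abstract does not state whether the reported IoU is averaged over classes.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels: rows are ground-truth classes, columns are predictions."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def per_class_iou(cm: List[List[int]]) -> List[float]:
    """IoU_c = TP / (TP + FP + FN) for each class c."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                       # true c, predicted other
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted c, true other
        denom = tp + fp + fn
        ious.append(tp / denom if denom else float("nan"))
    return ious


def mean_accuracy(cm: List[List[int]]) -> float:
    """Average of per-class recall, over classes present in the ground truth."""
    accs = [cm[c][c] / sum(cm[c]) for c in range(len(cm)) if sum(cm[c]) > 0]
    return sum(accs) / len(accs)


# Hypothetical flattened label maps (0 = background, 1 = crop, 2 = weed).
y_true = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 2, 2, 2, 2]

cm = confusion_matrix(y_true, y_pred, num_classes=3)
print("per-class IoU:", per_class_iou(cm))
print("mean accuracy:", mean_accuracy(cm))
```

In practice the label maps would be flattened prediction and ground-truth masks accumulated over the whole test set rather than the ten toy pixels used here.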
Keywords