Drones (Nov 2024)
Mamba-UAV-SegNet: A Multi-Scale Adaptive Feature Fusion Network for Real-Time Semantic Segmentation of UAV Aerial Imagery
Abstract
Accurate semantic segmentation of high-resolution images captured by unmanned aerial vehicles (UAVs) is crucial for applications in environmental monitoring, urban planning, and precision agriculture. However, challenges such as class imbalance, small-object detection, and intricate boundary details complicate the analysis of UAV imagery. To address these issues, we propose Mamba-UAV-SegNet, a novel real-time semantic segmentation network specifically designed for UAV images. The network integrates a Multi-Head Mamba Block (MH-Mamba Block) for enhanced multi-scale feature representation, an Adaptive Boundary Enhancement Fusion Module (ABEFM) for improved boundary-aware feature fusion, and an edge-detail auxiliary training branch to capture fine-grained details. The practical utility of our method is demonstrated through its application to farmland segmentation. Extensive experiments on the UAV-City, VDD, and UAVid datasets show that our model outperforms state-of-the-art methods, achieving mean Intersection over Union (mIoU) scores of 71.2%, 77.5%, and 69.3%, respectively. Ablation studies confirm the effectiveness of each component and their combined contributions to overall performance. The proposed method balances segmentation accuracy and computational efficiency, maintaining real-time inference speeds suitable for practical UAV applications.
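To make the architecture summarized above concrete, the following is a minimal PyTorch-style sketch of how the three named components (the MH-Mamba Block, the ABEFM, and the edge-detail auxiliary branch) might compose into a single forward pass. All class names, channel sizes, and layer choices here are illustrative assumptions rather than the paper's implementation; in particular, the MH-Mamba Block is stubbed with a plain convolutional mixer instead of a state-space model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical skeleton: module names, interfaces, and channel widths below
# are assumptions for illustration, not the authors' released code.

class MHMambaBlock(nn.Module):
    """Stand-in for the Multi-Head Mamba Block (multi-scale feature mixing)."""
    def __init__(self, channels):
        super().__init__()
        # Placeholder mixer; the real block uses Mamba-style state-space heads.
        self.mix = nn.Sequential(nn.Conv2d(channels, channels, 3, padding=1),
                                 nn.BatchNorm2d(channels), nn.ReLU(inplace=True))

    def forward(self, x):
        return x + self.mix(x)

class ABEFM(nn.Module):
    """Stand-in for the Adaptive Boundary Enhancement Fusion Module."""
    def __init__(self, low_ch, high_ch, out_ch):
        super().__init__()
        self.proj = nn.Conv2d(low_ch + high_ch, out_ch, 1)

    def forward(self, low, high):
        # Upsample deep features and fuse them with shallow, boundary-rich ones.
        high = F.interpolate(high, size=low.shape[2:], mode="bilinear",
                             align_corners=False)
        return self.proj(torch.cat([low, high], dim=1))

class MambaUAVSegNet(nn.Module):
    def __init__(self, num_classes=8):
        super().__init__()
        self.stem = nn.Sequential(nn.Conv2d(3, 64, 3, stride=2, padding=1),
                                  nn.ReLU(inplace=True))
        self.stage_low = MHMambaBlock(64)
        self.down = nn.Conv2d(64, 128, 3, stride=2, padding=1)
        self.stage_high = MHMambaBlock(128)
        self.fuse = ABEFM(64, 128, 128)
        self.seg_head = nn.Conv2d(128, num_classes, 1)
        # Auxiliary edge head, used only during training to supervise boundaries.
        self.edge_head = nn.Conv2d(128, 1, 1)

    def forward(self, x):
        low = self.stage_low(self.stem(x))      # 1/2 resolution features
        high = self.stage_high(self.down(low))  # 1/4 resolution features
        fused = self.fuse(low, high)
        seg = F.interpolate(self.seg_head(fused), scale_factor=2,
                            mode="bilinear", align_corners=False)
        edge = self.edge_head(fused)            # auxiliary output, dropped at inference
        return seg, edge

# Quick shape check on a dummy UAV frame.
if __name__ == "__main__":
    seg, edge = MambaUAVSegNet()(torch.randn(1, 3, 512, 512))
    print(seg.shape, edge.shape)  # [1, 8, 512, 512] and [1, 1, 256, 256]
```

The sketch only illustrates the composition pattern implied by the abstract: two multi-scale feature stages, a boundary-aware fusion step, and an auxiliary edge output that is discarded at inference so the real-time main branch is unaffected.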
Keywords