Drones (Jul 2023)
Multi-Branch Parallel Networks for Object Detection in High-Resolution UAV Remote Sensing Images
Abstract
Uncrewed Aerial Vehicles (UAVs) are instrumental in advancing the field of remote sensing. Nevertheless, complex backgrounds and densely distributed objects pose considerable challenges for object detection in UAV remote sensing images. This paper proposes a Multi-Branch Parallel Network (MBPN), based on the ViTDet (Visual Transformer for Object Detection) model, that aims to improve object detection accuracy in UAV remote sensing images. First, the discriminative ability of the feature map fed into the Feature Pyramid Network (FPN) is improved by incorporating Receptive Field Enhancement (RFE) and Convolutional Self-Attention (CSA) modules. Second, to mitigate the loss of semantic information, the sampling operations of the FPN are replaced by Multi-Branch Upsampling (MBUS) and Multi-Branch Downsampling (MBDS) modules. Finally, a Feature-Concatenating Fusion (FCF) module merges feature maps from different levels, thereby addressing semantic misalignment. The proposed model is evaluated on both a custom UAV-captured WCH dataset and the publicly available NWPU VHR10 dataset. The experimental results show that, compared with the baseline model ViTDet-B, the proposed model increases APL by 2.4% and 0.7% on the WCH and NWPU VHR10 datasets, respectively.
Keywords