IEEE Access (Jan 2024)
MFCANet: Multiscale Feature Context Aggregation Network for Oriented Object Detection in Remote-Sensing Images
Abstract
Oriented object detection in remote-sensing images is a highly challenging task due to the wide fields of view and complex backgrounds these images contain. Although Convolutional Neural Networks (CNNs) and Transformer networks have made progress in this area, the extraction and fusion of features for small targets against complex backgrounds remain underexplored. To address this gap, we extend the RTMDet framework with three modules: a Focused Feature Context Aggregation Module, a Feature Context Information Enhancement Module, and a Multi-scale Feature Fusion Module. The Focused Feature Context Aggregation Module replaces the SPPFBottleneck (Spatial Pyramid Pooling - Fast bottleneck) of the baseline, extracting small-target features more effectively by focusing on contextual information. The Feature Context Information Enhancement Module strengthens the model's perception of multi-dimensional temporal and spatial information. Finally, the Multi-scale Feature Fusion Module combines the original features with the fused ones to prevent the loss of scale-specific features during fusion. The resulting model, the Multi-scale Feature Context Aggregation Network (MFCANet), was evaluated on four challenging remote-sensing datasets (MAR20, SRSDD, HRSC, and DIOR-R). The experimental results demonstrate that our method outperforms the baseline, improving mAP by 2.13%, 10.28%, 1.46%, and 1.13% on MAR20, SRSDD, HRSC, and DIOR-R, respectively.
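To make the final fusion step concrete, the following is a minimal PyTorch-style sketch of the residual-style combination described above, where the original feature map is merged with the fused multi-scale features so that scale-specific details are preserved. The module and parameter names here are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

class ResidualFusion(nn.Module):
    """Sketch of the fusion idea: combine the original (pre-fusion)
    feature map with the aggregated multi-scale features so that
    scale-specific details are not lost. Names are hypothetical."""

    def __init__(self, channels: int):
        super().__init__()
        # 1x1 projection applied to the fused features before merging
        self.proj = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, original: torch.Tensor, fused: torch.Tensor) -> torch.Tensor:
        # Element-wise sum keeps the original features intact while
        # injecting the aggregated multi-scale context.
        return original + self.proj(fused)

# Usage: merge one backbone feature level with its fused counterpart.
feat = torch.randn(1, 256, 64, 64)    # original feature map
fused = torch.randn(1, 256, 64, 64)   # output of multi-scale fusion
out = ResidualFusion(256)(feat, fused)
```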
Keywords