Remote Sensing (Jul 2024)
Detection Based on Semantics and a Detail Infusion Feature Pyramid Network and a Coordinate Adaptive Spatial Feature Fusion Mechanism Remote Sensing Small Object Detector
Abstract
Remote sensing imagery, such as unmanned aerial vehicle (UAV) aerial imagery, poses several challenges for object detection: large variation in target size, a predominance of small targets, and dense clutter and occlusion in complex environments. To address these challenges, this paper optimizes the YOLOv8n model and proposes a small-object-detection model called DDSC-YOLO. First, a DualC2f structure is introduced to improve the feature-extraction capability of the model. This structure uses dual convolution and group convolution to improve cross-channel communication while preserving the information in the original input feature maps. Next, a new attention mechanism, DCNv3LKA, is developed. This mechanism uses adaptive, fine-grained information extraction to simulate receptive fields similar to those of self-attention, allowing it to adapt to a wide range of target sizes. To reduce false and missed detections of small targets in aerial photography, we design a Semantics and Detail Infusion Feature Pyramid Network (SDI-FPN) and add a dedicated detection scale for small targets, mitigating the loss of contextual information in the model. In addition, a coordinate adaptive spatial feature fusion (CASFF) mechanism is used to optimize the original detection head, resolving conflicts between multi-scale information while improving small-target localization accuracy and long-range dependency perception. Tests on the VisDrone2019 dataset show that DDSC-YOLO improves mAP0.5 by 9.3% over YOLOv8n, and its performance on the SSDD and RSOD datasets supports its generalization capability. These results indicate that the proposed approach is effective for small-target detection.
Keywords