Scientific Reports (Jan 2025)
SED-YOLO based multi-scale attention for small object detection in remote sensing
Abstract
Object detection is crucial for remote sensing image processing, yet detecting small objects remains highly challenging because of image noise and cluttered backgrounds. To address this challenge, this paper proposes SED-YOLO, an improved network based on YOLOv5s. First, Switchable Atrous Convolution (SAC) replaces the standard convolutions in the original C3 modules of the backbone network, enhancing feature extraction capability and adaptability. Second, the Efficient Multi-Scale Attention (EMA) mechanism is introduced at the end of the backbone network to enable efficient multi-scale feature learning, which reduces computational cost while preserving crucial information. Third, in the neck, an adaptive Concat method dynamically adjusts the feature fusion strategy according to image content and object characteristics, strengthening the model's ability to handle diverse objects. Finally, the three-scale detection head is expanded to four scales by adding a small object detection layer, and the Dynamic Head (DyHead) module is incorporated, which enhances the detection head's expressive power by dynamically adjusting attention weights over the feature maps. Experimental results show that the improved network achieves a mean Average Precision (mAP) of 71.6% on the DOTA dataset, surpassing the original YOLOv5s by 2.4% and improving the accuracy of small object detection.
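The idea behind Switchable Atrous Convolution can be illustrated with a minimal one-dimensional sketch. It is not the SED-YOLO implementation: the function names, the dilation rates 1 and 3, the shared kernel, and the externally supplied per-position switch values are all assumptions made for illustration (in a real network the switch would be predicted from the input features). The sketch only shows how a soft switch blends a small and a large receptive field.

```python
def dilated_conv1d(signal, kernel, rate):
    """Zero-padded, same-length 1D convolution with dilation `rate`."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + (k - half) * rate  # taps are spaced `rate` samples apart
            if 0 <= j < len(signal):   # positions outside the signal count as zero
                acc += w * signal[j]
        out.append(acc)
    return out


def switchable_atrous_conv1d(signal, kernel, switch, small_rate=1, large_rate=3):
    """Blend two dilation rates per position (hypothetical helper).

    switch[i] = 1 keeps only the small-rate response at position i,
    switch[i] = 0 keeps only the large-rate response.
    """
    near = dilated_conv1d(signal, kernel, small_rate)
    far = dilated_conv1d(signal, kernel, large_rate)
    return [s * a + (1.0 - s) * b for s, a, b in zip(switch, near, far)]


if __name__ == "__main__":
    impulse = [0, 0, 0, 1, 0, 0, 0]
    box = [1, 1, 1]
    print(switchable_atrous_conv1d(impulse, box, [0.5] * 7))
```

With a switch of 1 everywhere the impulse spreads only to its immediate neighbours, while a switch of 0 spreads it three samples away, which is the wider context that the larger atrous rate provides.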
Keywords