IET Image Processing (Nov 2024)
YOLO‐RSFM: An efficient road small object detection method
Abstract
To tackle challenges in road multi‐object detection, such as object occlusion, small objects, and multi‐scale objects, a new YOLOv8n‐RSFM structure is proposed. Its key improvement is a transformer decoder head, which optimizes the matching between ground‐truth and predicted boxes and thereby addresses object overlap and multi‐scale detection. In addition, a small object detection layer is incorporated to retain information that is crucial for small objects, significantly improving detection accuracy for small targets. To enhance learning capacity and reduce redundant computation, the FasterNet backbone replaces CSPDarknet53, which accelerates training. Finally, the INNER‐MPDIoU loss function replaces the complete IoU (CIoU) loss of the original algorithm to accelerate convergence and obtain more accurate regression results. Experiments on several datasets show that the proposed YOLOv8n‐RSFM outperforms the original YOLOv8n in small target detection: on the VisDrone, TinyPerson, and VSCrowd datasets, the mean accuracy percentage improved by 7.9%, 12.3%, and 4.5%, respectively.
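The abstract names the INNER‐MPDIoU loss without defining it. As a rough illustration only, the sketch below combines the commonly published definitions of MPDIoU (IoU penalized by the squared distances between matching top‐left and bottom‐right corners, normalized by the squared image diagonal) and Inner‐IoU (IoU computed on auxiliary boxes scaled about each box center). The function names, the `(x1, y1, x2, y2)` box layout, and the default `ratio=0.75` are assumptions made for this example, not details taken from the paper, and the authors' exact formulation may differ.

```python
def iou(a, b):
    """Plain IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def scale_box(box, ratio):
    """Auxiliary (inner) box: same center, width and height scaled by `ratio`."""
    cx, cy = (box[0] + box[2]) / 2.0, (box[1] + box[3]) / 2.0
    hw = (box[2] - box[0]) * ratio / 2.0
    hh = (box[3] - box[1]) * ratio / 2.0
    return (cx - hw, cy - hh, cx + hw, cy + hh)


def inner_mpdiou_loss(pred, gt, img_w, img_h, ratio=0.75):
    """Hypothetical Inner-MPDIoU loss: 1 - IoU_inner + d1^2/(w^2+h^2) + d2^2/(w^2+h^2).

    `ratio` is an assumed scale factor for the auxiliary boxes;
    `img_w` and `img_h` are the input image dimensions.
    """
    inner = iou(scale_box(pred, ratio), scale_box(gt, ratio))
    norm = img_w ** 2 + img_h ** 2
    # Squared distances between top-left corners and between bottom-right corners.
    d1 = (pred[0] - gt[0]) ** 2 + (pred[1] - gt[1]) ** 2
    d2 = (pred[2] - gt[2]) ** 2 + (pred[3] - gt[3]) ** 2
    return 1.0 - inner + d1 / norm + d2 / norm
```

In this sketch a perfect prediction gives a loss of zero, while non‐overlapping boxes still receive a gradient signal through the corner‐distance terms, which is the usual motivation given for MPDIoU‐style losses.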
Keywords