YOLO‐RSFM: An efficient road small object detection method

Pei Tang; Zhenyu Ding; Mao Lv; Minnan Jiang; Weikai Xu

doi:10.1049/ipr2.13247

IET Image Processing (Nov 2024)

Affiliations

Pei Tang: School of Automotive Engineering, Yancheng Institute of Technology, Yancheng, China
Zhenyu Ding: School of Automotive Engineering, Yancheng Institute of Technology, Yancheng, China
Mao Lv: School of Automotive Engineering, Yancheng Institute of Technology, Yancheng, China
Minnan Jiang: School of Automotive Engineering, Yancheng Institute of Technology, Yancheng, China
Weikai Xu: School of Automotive Engineering, Yancheng Institute of Technology, Yancheng, China

DOI: https://doi.org/10.1049/ipr2.13247
Journal volume & issue: Vol. 18, no. 13
pp. 4263 – 4274

Abstract

To address challenges in road multi‐object detection, such as object occlusion, small objects, and multi‐scale objects, a new YOLOv8n‐RSFM structure is proposed. Its key improvement is a transformer decoder head, which optimizes the matching between ground‐truth and predicted boxes and thereby addresses object overlap and multi‐scale detection. Additionally, a small object detection layer is incorporated to retain information that is useful for detecting small objects, significantly improving detection accuracy for small targets. To enhance learning capacity and reduce redundant computation, the FasterNet backbone replaces CSPDarknet53, which accelerates training. Finally, the INNER‐MPDIoU loss function replaces the original algorithm's complete IoU (CIoU) loss to accelerate convergence and obtain more accurate regression results. Experiments were conducted on several datasets. The results show that the proposed YOLOv8n‐RSFM outperforms the original YOLOv8n in small target detection: on the VisDrone, TinyPerson, and VSCrowd datasets, the mean accuracy percentage improved by 7.9%, 12.3%, and 4.5%, respectively.
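The abstract names the INNER‐MPDIoU loss but does not state its formula. As a rough illustration only, the sketch below combines the two ideas as they are commonly described elsewhere: an "inner" IoU computed on auxiliary boxes shrunk around each box center, plus the MPDIoU corner‐distance penalties normalized by the squared image diagonal. The exact combination, the function names, and the `ratio=0.75` default are assumptions made for this example, not details confirmed by the paper.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), with x1 < x2 and y1 < y2


def iou(a: Box, b: Box) -> float:
    """Plain intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def scale_box(box: Box, ratio: float) -> Box:
    """Auxiliary ("inner") box: same center, width and height scaled by ratio."""
    cx, cy = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
    half_w = (box[2] - box[0]) * ratio / 2
    half_h = (box[3] - box[1]) * ratio / 2
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)


def inner_mpdiou_loss(pred: Box, gt: Box, img_w: float, img_h: float,
                      ratio: float = 0.75) -> float:
    """Assumed form: 1 - IoU_inner + d1^2 / (w^2 + h^2) + d2^2 / (w^2 + h^2).

    d1 is the distance between the top-left corners and d2 the distance
    between the bottom-right corners; w and h are the image dimensions.
    The ratio default is illustrative, not taken from the paper.
    """
    inner_iou = iou(scale_box(pred, ratio), scale_box(gt, ratio))
    norm = img_w ** 2 + img_h ** 2
    d1_sq = (pred[0] - gt[0]) ** 2 + (pred[1] - gt[1]) ** 2
    d2_sq = (pred[2] - gt[2]) ** 2 + (pred[3] - gt[3]) ** 2
    return 1.0 - inner_iou + d1_sq / norm + d2_sq / norm
```

Under this assumed form, a perfect prediction gives a loss of zero, and the corner‐distance terms keep the loss informative even when the two boxes do not overlap, which is one plausible reason such a loss could speed up convergence on small targets.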

Published in IET Image Processing

ISSN: 1751-9659 (Print); 1751-9667 (Online)
Publisher: Wiley
Country of publisher: United Kingdom
LCC subjects: Technology: Photography; Science: Mathematics: Instruments and machines: Electronic computers. Computer science: Computer software
Website: https://ietresearch.onlinelibrary.wiley.com/journal/17519667
