Improved YOLOv7 Algorithm for Small Object Detection in Unmanned Aerial Vehicle Image Scenarios

Xinmin Li; Yingkun Wei; Jiahui Li; Wenwen Duan; Xiaoqiang Zhang; Yi Huang

doi:10.3390/app14041664

Applied Sciences (Feb 2024)

Affiliations

Xinmin Li: College of Computer Science, Chengdu University, Chengdu 610100, China
Yingkun Wei: School of Information Engineering, Southwest University of Science and Technology, Mianyang 621000, China
Jiahui Li: School of Information Engineering, Southwest University of Science and Technology, Mianyang 621000, China
Wenwen Duan: School of Information Engineering, Southwest University of Science and Technology, Mianyang 621000, China
Xiaoqiang Zhang: School of Information Engineering, Southwest University of Science and Technology, Mianyang 621000, China
Yi Huang: Department of Information and Communication Engineering, Tongji University, Shanghai 201804, China

DOI: https://doi.org/10.3390/app14041664
Journal volume & issue: Vol. 14, no. 4, p. 1664

Abstract

Object detection in unmanned aerial vehicle (UAV) images has become a popular research topic in recent years. However, UAV images are captured from high altitudes and contain a large proportion of small objects and densely packed object regions, which makes small object detection particularly challenging. To address this issue, we propose an efficient YOLOv7-UAV algorithm in which a low-level prediction head (P2) is added to detect small objects from the shallow feature map, and the deep-level prediction head (P5) is removed to reduce the effect of excessive down-sampling. Furthermore, we modify the bidirectional feature pyramid network (BiFPN) structure with a weighted cross-level connection to improve the fusion of multi-scale feature maps in UAV images. To mitigate the mismatch between the predicted box and the ground-truth box, the SCYLLA-IoU (SIoU) function is employed in the regression loss, which also accelerates training convergence. Moreover, the proposed YOLOv7-UAV algorithm is quantized and compiled in the Vitis-AI development environment and validated in terms of power consumption and hardware resources on the FPGA platform. The experiments show that, compared to YOLOv7, YOLOv7-UAV reduces resource consumption by 28% and improves mAP by 3.9%, and that the FPGA implementation improves energy efficiency by a factor of 12 compared to the GPU.
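
The weighted fusion idea behind BiFPN can be illustrated with a short sketch. The code below shows the "fast normalized fusion" rule from the original BiFPN design, in which each input feature map gets a learnable, non-negative weight and the weights are normalized to sum to roughly one. It is not the authors' modified cross-level connection, whose details are not given in the abstract. The function name, the flat-list feature representation, and the `eps` value are assumptions made for illustration.

```python
from typing import List, Sequence

EPS = 1e-4  # small stabilizing constant; the exact value is an assumption


def fast_normalized_fusion(features: Sequence[Sequence[float]],
                           raw_weights: Sequence[float],
                           eps: float = EPS) -> List[float]:
    """Fuse same-sized feature vectors with non-negative, normalized weights.

    Hypothetical helper: each feature map is assumed to be already resized
    to a common resolution and flattened into a list of floats.
    """
    if len(features) != len(raw_weights):
        raise ValueError("one weight per input feature map is required")
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("all feature maps must have the same size")

    # ReLU keeps every learnable weight non-negative.
    weights = [max(w, 0.0) for w in raw_weights]
    # Normalize so the contributions sum to (almost) one.
    total = sum(weights) + eps
    norm = [w / total for w in weights]

    # Weighted element-wise sum over the input feature maps.
    return [sum(n * f[i] for n, f in zip(norm, features))
            for i in range(length)]


if __name__ == "__main__":
    shallow = [1.0, 2.0, 3.0]   # stand-in for a shallow (e.g. P2-level) feature
    deep = [3.0, 2.0, 1.0]      # stand-in for a deeper feature
    skip = [0.0, 0.0, 6.0]      # stand-in for a cross-level (skip) input
    # The negative weight is clipped to zero, so `skip` does not contribute.
    print(fast_normalized_fusion([shallow, deep, skip], [1.0, 1.0, -0.5]))
```

In a real network the weights would be trainable parameters and the inputs would be multi-channel tensors. The sketch only shows how the normalization lets the network learn how much each scale contributes to the fused map.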

Published in Applied Sciences

ISSN: 2076-3417 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Engineering (General). Civil engineering (General); Science: Biology (General); Science: Physics; Science: Chemistry
Website: http://www.mdpi.com/journal/applsci
