YOLOAL: Focusing on the Object Location for Detection on Drone Imagery

Xinting Chen; Wenzhu Yang; Shuang Zeng; Lei Geng; Yanyan Jiao

doi:10.1109/access.2023.3332815

IEEE Access (Jan 2023)

Affiliations

Xinting Chen: School of Cyber Security and Computer, Hebei University, Baoding, China
Wenzhu Yang: School of Cyber Security and Computer, Hebei University, Baoding, China
Shuang Zeng: School of Cyber Security and Computer, Hebei University, Baoding, China
Lei Geng: School of Cyber Security and Computer, Hebei University, Baoding, China
Yanyan Jiao: School of Cyber Security and Computer, Hebei University, Baoding, China

DOI: https://doi.org/10.1109/access.2023.3332815
Journal volume & issue: Vol. 11, pp. 128886 – 128897

Abstract

Object detection in drone-captured scenes, which can be regarded as the task of detecting dense small objects, remains challenging. Drones fly at varying altitudes, so the apparent size of objects changes significantly, which poses a challenge for the model. In addition, detection models need to detect small, dense objects more rapidly. To address these issues, we propose YOLOAL, a model that emphasizes object location information. It incorporates a new attention mechanism, the Convolution and Coordinate Attention Module (CCAM), which performs better than traditional attention mechanisms in dense small-object scenes because it adds coordinate information that helps identify attention regions. Furthermore, our model uses a new loss function that combines the Efficient IoU (EIoU) and Alpha-IoU methods and achieves better results than traditional approaches. The proposed model achieves state-of-the-art performance on the VisDrone and DOTA datasets. On the VisDrone dataset, YOLOAL reaches an AP50 (average precision at an Intersection over Union threshold of 0.5) of 63.6% and an mAP (average precision averaged over 10 IoU thresholds from 0.5 to 0.95) of 40.8% at a real-time speed of 0.27 seconds, and its mAP on the DOTA dataset reaches 39% on an NVIDIA A4000.
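
The abstract does not state the exact form of the combined loss. As a rough illustration only, the sketch below shows one plausible way to combine the two ideas: it takes the usual EIoU terms (IoU, normalized center distance, normalized width and height differences) and raises each to a power alpha, following the general Alpha-IoU recipe. The function name `alpha_eiou_loss`, the box format `(x1, y1, x2, y2)`, and the default `alpha=3.0` are assumptions made for this example and may differ from what YOLOAL actually uses.

```python
def alpha_eiou_loss(box, gt, alpha=3.0, eps=1e-9):
    """Hypothetical alpha-EIoU loss for one predicted box and one target box.

    Boxes are (x1, y1, x2, y2) with x2 > x1 and y2 > y1.
    This is an illustrative combination, not the paper's confirmed formula.
    """
    bx1, by1, bx2, by2 = box
    gx1, gy1, gx2, gy2 = gt
    bw, bh = bx2 - bx1, by2 - by1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Intersection over Union.
    inter_w = max(0.0, min(bx2, gx2) - max(bx1, gx1))
    inter_h = max(0.0, min(by2, gy2) - max(by1, gy1))
    inter = inter_w * inter_h
    union = bw * bh + gw * gh - inter + eps
    iou = inter / union

    # Width, height, and squared diagonal of the smallest enclosing box.
    cw = max(bx2, gx2) - min(bx1, gx1)
    ch = max(by2, gy2) - min(by1, gy1)
    c2 = cw ** 2 + ch ** 2 + eps

    # EIoU penalty terms: center distance, width gap, height gap.
    dx = (bx1 + bx2) / 2 - (gx1 + gx2) / 2
    dy = (by1 + by2) / 2 - (gy1 + gy2) / 2
    center_term = (dx ** 2 + dy ** 2) / c2
    width_term = (bw - gw) ** 2 / (cw ** 2 + eps)
    height_term = (bh - gh) ** 2 / (ch ** 2 + eps)

    # Alpha-IoU style power applied to every term (assumed combination).
    return (1.0 - iou ** alpha
            + center_term ** alpha
            + width_term ** alpha
            + height_term ** alpha)
```

With `alpha=1.0` the expression reduces to the plain EIoU loss. Larger values of alpha put relatively more weight on boxes that already overlap the target well, which is the motivation usually given for the Alpha-IoU family.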

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
