SDMSEAF-YOLOv8: a framework to significantly improve the detection performance of unmanned aerial vehicle images

Linxuan Li; Xiaoyu Liu; Xuan Chen; Fengjuan Yin; Bin Chen; Yufeng Wang; Fanbin Meng

doi:10.1080/10106049.2024.2339294

Geocarto International (Jan 2024)

Affiliations

Linxuan Li: School of Medical Information Engineering, Jining Medical University, Rizhao, China
Xiaoyu Liu: School of Medical Information Engineering, Jining Medical University, Rizhao, China
Xuan Chen: School of Medical Information Engineering, Jining Medical University, Rizhao, China
Fengjuan Yin: School of Medical Information Engineering, Jining Medical University, Rizhao, China
Bin Chen: School of Mechanical and Electronic Engineering, Shandong Agriculture and Engineering University, Zibo, China
Yufeng Wang: School of Medical Information Engineering, Jining Medical University, Rizhao, China
Fanbin Meng: School of Medical Information Engineering, Jining Medical University, Rizhao, China

Journal volume & issue: Vol. 39, no. 1

Abstract

The detailed, high-resolution images captured by drones challenge target detection algorithms because they contain complex scenes and small targets. Moreover, targets in unmanned aerial vehicle (UAV) images are usually affected by factors such as viewing perspective, occlusion, and lighting, which make detection more difficult. To address these issues, we propose SDMSEAF-YOLOv8, an improved target detection model based on YOLOv8 and combined with a Bi-directional Feature Pyramid Network to improve the model's perception of multiscale targets. A Space-to-depth layer replaces the traditional strided convolution layer to enhance the extraction of fine-grained information and small-target features. A Multi-Separated and Enhancement Attention module strengthens feature learning in occluded target regions, thus reducing missed and false detections. Four detection heads are employed, each responsible for a different size range, to improve the accuracy and robustness of tiny- and small-target detection. The conventional non-maximum suppression algorithm is also improved to reduce missed detections in densely occluded scenes: an attenuation function adjusts the confidence of each candidate box according to its overlap with the highest-scoring box. Experiments demonstrate that the accuracy of SDMSEAF-YOLOv8 exceeds that of state-of-the-art models on the VisDrone2019-DET-val dataset, with a mAP of 42.9% at 640-pixel resolution, which is 14.8% higher than the baseline YOLOv8-x model and 6.0% higher than the known state-of-the-art Fine-Grained Target Focusing Network model, with detection twice as fast.
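
The confidence-attenuation idea behind the improved non-maximum suppression can be illustrated with a minimal sketch. The abstract does not specify the attenuation function, so the Gaussian decay from the well-known Soft-NMS algorithm is substituted here as an assumption; the function names `iou` and `soft_nms`, the `sigma` parameter, and the `score_threshold` value are all hypothetical and are not taken from the paper.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Return (index, score) pairs in the order boxes are selected.

    Instead of discarding every box that overlaps the current best box,
    each remaining score is multiplied by exp(-IoU^2 / sigma), an assumed
    Gaussian attenuation function. Boxes are dropped only when their
    decayed score falls below score_threshold.
    """
    candidates = list(zip(range(len(boxes)), scores))
    kept = []
    while candidates:
        # Pick the highest-scoring remaining box.
        best_pos = max(range(len(candidates)), key=lambda i: candidates[i][1])
        best_idx, best_score = candidates.pop(best_pos)
        kept.append((best_idx, best_score))

        # Attenuate the remaining scores according to their overlap with it.
        decayed = []
        for idx, score in candidates:
            overlap = iou(boxes[best_idx], boxes[idx])
            new_score = score * math.exp(-(overlap ** 2) / sigma)
            if new_score >= score_threshold:
                decayed.append((idx, new_score))
        candidates = decayed
    return kept


if __name__ == "__main__":
    demo_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    demo_scores = [0.9, 0.8, 0.7]
    for index, score in soft_nms(demo_boxes, demo_scores):
        print(index, round(score, 3))
```

In this sketch the heavily overlapping second box is kept with a reduced score instead of being deleted outright, which is the property that helps in densely occluded scenes: a true neighbouring object is demoted but can still survive the final confidence cut.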

Published in Geocarto International

ISSN: 1010-6049 (Print); 1752-0762 (Online)
Publisher: Taylor & Francis Group
Country of publisher: United Kingdom
LCC subjects: Geography. Anthropology. Recreation: Physical geography
Website: https://www.tandfonline.com/tgei
