PVswin-YOLOv8s: UAV-Based Pedestrian and Vehicle Detection for Traffic Management in Smart Cities Using Improved YOLOv8

Noor Ul Ain Tahir; Zhe Long; Zuping Zhang; Muhammad Asim; Mohammed ELAffendi

doi:10.3390/drones8030084

Drones (Feb 2024)

PVswin-YOLOv8s: UAV-Based Pedestrian and Vehicle Detection for Traffic Management in Smart Cities Using Improved YOLOv8

Noor Ul Ain Tahir,
Zhe Long,
Zuping Zhang,
Muhammad Asim,
Mohammed ELAffendi

Affiliations

Noor Ul Ain Tahir: School of Computer Science and Engineering, Central South University, Changsha 410083, China
Zhe Long: School of Computer Science and Engineering, Central South University, Changsha 410083, China
Zuping Zhang: School of Computer Science and Engineering, Central South University, Changsha 410083, China
Muhammad Asim: EIAS Data Science and Blockchain Laboratory, College of Computer and Information Sciences, Prince Sultan University, Riyadh 11586, Saudi Arabia
Mohammed ELAffendi: EIAS Data Science and Blockchain Laboratory, College of Computer and Information Sciences, Prince Sultan University, Riyadh 11586, Saudi Arabia

DOI: https://doi.org/10.3390/drones8030084
Journal volume & issue: Vol. 8, no. 3
p. 84

Abstract

Read online

In smart cities, effective traffic congestion management hinges on adept pedestrian and vehicle detection. Unmanned Aerial Vehicles (UAVs) offer a solution with mobility, cost-effectiveness, and a wide field of view, and yet, optimizing recognition models is crucial to surmounting challenges posed by small and occluded objects. To address these issues, we utilize the YOLOv8s model and a Swin Transformer block and introduce the PVswin-YOLOv8s model for pedestrian and vehicle detection based on UAVs. Firstly, the backbone network of YOLOv8s incorporates the Swin Transformer model for global feature extraction for small object detection. Secondly, to address the challenge of missed detections, we opt to integrate the CBAM into the neck of the YOLOv8. Both the channel and the spatial attention modules are used in this addition because of how well they extract feature information flow across the network. Finally, we employ Soft-NMS to improve the accuracy of pedestrian and vehicle detection in occlusion situations. Soft-NMS increases performance and manages overlapped boundary boxes well. The proposed network reduced the fraction of small objects overlooked and enhanced model detection performance. Performance comparisons with different YOLO versions ( for example YOLOv3 extremely small, YOLOv5, YOLOv6, and YOLOv7), YOLOv8 variants (YOLOv8n, YOLOv8s, YOLOv8m, and YOLOv8l), and classical object detectors (Faster-RCNN, Cascade R-CNN, RetinaNet, and CenterNet) were used to validate the superiority of the proposed PVswin-YOLOv8s model. The efficiency of the PVswin-YOLOv8s model was confirmed by the experimental findings, which showed a 4.8% increase in average detection accuracy (mAP) compared to YOLOv8s on the VisDrone2019 dataset.

Published in Drones

ISSN: 2504-446X (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Motor vehicles. Aeronautics. Astronautics
Website: http://www.mdpi.com/journal/drones

About the journal

Abstract

Keywords