T-YOLO: Tiny Vehicle Detection Based on YOLO and Multi-Scale Convolutional Neural Networks

Daniel Padilla Carrasco; Hatem A. Rashwan; Miguel Angel Garcia; Domenec Puig

doi:10.1109/ACCESS.2021.3137638

IEEE Access (Jan 2023)

Affiliations

Daniel Padilla Carrasco: Department of Computer Engineering and Mathematics, Universitat Rovira i Virgili, Tarragona, Spain
Hatem A. Rashwan: Department of Computer Engineering and Mathematics, Universitat Rovira i Virgili, Tarragona, Spain
Miguel Angel Garcia: Department of Electronic and Communications Technology, Universidad Autónoma de Madrid, Madrid, Spain
Domenec Puig: Department of Computer Engineering and Mathematics, Universitat Rovira i Virgili, Tarragona, Spain

DOI: https://doi.org/10.1109/ACCESS.2021.3137638
Journal volume & issue: Vol. 11
pp. 22430 – 22440

Abstract

Solving real-life problems in smart city applications with deep neural networks, such as parking occupancy detection, requires fine-tuning these networks. For large parking lots, it is desirable to use a zenithal-plane camera mounted at a great distance, so that the entire parking space, or a large parking area, can be monitored with a single camera. Today’s most popular object detection models, such as YOLO, achieve good precision scores at real-time speed. However, when the data differ from general-purpose datasets such as COCO and ImageNet, there is a large margin for improvement. In this paper, we propose a modified, yet lightweight, deep object detection model based on the YOLO-v5 architecture. The proposed model can detect large, small, and tiny objects. Specifically, we propose a multi-scale mechanism that learns deep discriminative feature representations at different scales and automatically determines the most suitable scales for detecting objects in a scene (in our case, vehicles). The proposed multi-scale module reduces the number of trainable parameters compared to the original YOLO-v5 architecture: the experiments show a small reduction from 7.28 million parameters in the YOLO-v5-S profile to 7.26 million parameters in our model. The experimental results also demonstrate that precision is improved by a large margin. In addition, we improved the detection speed, inferring at 30 fps, relative to the YOLO-v5-L/X profiles. Finally, tiny vehicle detection performance improved significantly, by 33%, compared to the YOLO-v5-X profile.
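
To put the parameter counts quoted in the abstract in perspective, the relative reduction can be worked out directly. The short Python sketch below is only an illustration: the helper name `relative_reduction` and the variable names are hypothetical, and the two inputs are the rounded figures from the abstract (7.28 and 7.26 million), not values read from the models themselves.

```python
def relative_reduction(before: float, after: float) -> float:
    """Return the fractional reduction going from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before


# Rounded parameter counts (in millions) as quoted in the abstract.
yolov5s_params_m = 7.28  # YOLO-v5-S profile
t_yolo_params_m = 7.26   # proposed model

reduction = relative_reduction(yolov5s_params_m, t_yolo_params_m)
print(f"Parameter reduction: {reduction:.2%}")  # prints "Parameter reduction: 0.27%"
```

On these rounded figures the saving is roughly 0.27% of the YOLO-v5-S parameter count, which matches the abstract's description of the reduction as small.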

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
