Vehicle-Rear: A New Dataset to Explore Feature Fusion for Vehicle Identification Using Convolutional Neural Networks

Icaro O. De Oliveira; Rayson Laroca; David Menotti; Keiko Veronica Ono Fonseca; Rodrigo Minetto

doi:10.1109/ACCESS.2021.3097964

IEEE Access (Jan 2021)

Affiliations

Icaro O. De Oliveira: Post-Graduate Program in Electrical and Computer Engineering, Federal University of Technology-Paraná, Curitiba, Brazil
Rayson Laroca: Department of Informatics, Federal University of Paraná, Curitiba, Brazil
David Menotti: Department of Informatics, Federal University of Paraná, Curitiba, Brazil
Keiko Veronica Ono Fonseca: Post-Graduate Program in Electrical and Computer Engineering, Federal University of Technology-Paraná, Curitiba, Brazil
Rodrigo Minetto: Post-Graduate Program in Electrical and Computer Engineering, Federal University of Technology-Paraná, Curitiba, Brazil

Journal volume & issue: Vol. 9, pp. 101065–101077

Abstract

This work addresses the problem of vehicle identification across non-overlapping cameras. As our main contribution, we introduce a novel dataset for vehicle identification, called Vehicle-Rear, that contains more than three hours of high-resolution videos, with accurate information about the make, model, color, and year of nearly 3,000 vehicles, in addition to the position and identification of their license plates. To explore our dataset, we design a two-stream Convolutional Neural Network (CNN) that simultaneously uses two of the most distinctive and persistent features available: the vehicle’s appearance and its license plate. This is an attempt to tackle a major problem: false alarms caused by vehicles with similar designs or by very similar license plate identifiers. In the first network stream, shape similarities are identified by a Siamese CNN that uses a pair of low-resolution vehicle patches recorded by two different cameras. In the second stream, we use a CNN for Optical Character Recognition (OCR) to extract textual information, confidence scores, and string similarities from a pair of high-resolution license plate patches. Features from both streams are then merged by a sequence of fully connected layers to reach a final decision. In our experiments, we compared the two-stream network against several well-known CNN architectures using single or multiple vehicle features. The architectures, trained models, and dataset are publicly available at https://github.com/icarofua/vehicle-rear.
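
To make the fusion idea in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes the shape stream has already produced a similarity score in [0, 1] and the OCR stream has produced a plate string with per-character confidences for each camera. The names `PlateReading`, `plate_similarity`, `fuse_features`, and `decide` are hypothetical, the normalized Levenshtein distance is only one possible string similarity, and a single logistic unit with hand-picked weights stands in for the trained fully connected layers.

```python
from dataclasses import dataclass
from math import exp
from typing import List


@dataclass
class PlateReading:
    """Hypothetical container for the OCR output of one license plate patch."""
    text: str                 # recognized plate string
    confidences: List[float]  # one confidence score per recognized character


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (char_a != char_b)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def plate_similarity(a: str, b: str) -> float:
    """Normalized similarity in [0, 1]; 1.0 means identical strings (assumed metric)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def fuse_features(shape_similarity: float,
                  plate_a: PlateReading,
                  plate_b: PlateReading) -> List[float]:
    """Concatenate the outputs of both streams into one feature vector."""
    return [
        shape_similarity,
        plate_similarity(plate_a.text, plate_b.text),
        min(plate_a.confidences, default=0.0),  # weakest character, camera 1
        min(plate_b.confidences, default=0.0),  # weakest character, camera 2
    ]


def decide(features: List[float], weights: List[float], bias: float,
           threshold: float = 0.5) -> bool:
    """Stand-in for the fully connected layers: a single logistic unit.

    The weights and bias are illustrative values, not trained parameters.
    """
    logit = sum(w * f for w, f in zip(weights, features)) + bias
    score = 1.0 / (1.0 + exp(-logit))
    return score >= threshold


if __name__ == "__main__":
    camera_1 = PlateReading("AYH2787", [0.99, 0.98, 0.97, 0.95, 0.96, 0.99, 0.98])
    camera_2 = PlateReading("AYH2737", [0.99, 0.97, 0.98, 0.96, 0.94, 0.60, 0.97])
    features = fuse_features(0.9, camera_1, camera_2)
    print(features)
    print(decide(features, weights=[2.0, 4.0, 1.0, 1.0], bias=-5.0))
```

In this sketch the two plates differ in a single character, so the string similarity alone stays high; combining it with the shape score and the OCR confidences is what gives a learned decision layer the chance to separate a true match from a near-identical plate on a different vehicle.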

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
