Artificial Intelligence for Text-Based Vehicle Search, Recognition, and Continuous Localization in Traffic Videos

Karen Panetta; Landry Kezebou; Victor Oludare; James Intriligator; Sos Agaian

doi:10.3390/ai2040041

AI (Dec 2021)

Affiliations

Karen Panetta: Department of Electrical & Computer Engineering, School of Engineering, Tufts University, Medford, MA 02155, USA
Landry Kezebou: Department of Electrical & Computer Engineering, School of Engineering, Tufts University, Medford, MA 02155, USA
Victor Oludare: Department of Electrical & Computer Engineering, School of Engineering, Tufts University, Medford, MA 02155, USA
James Intriligator: Department of Electrical & Computer Engineering, School of Engineering, Tufts University, Medford, MA 02155, USA
Sos Agaian: Department of Computer Science, School of Engineering, City University of New York (CUNY), New York, NY 10031, USA

DOI: https://doi.org/10.3390/ai2040041
Journal volume & issue: Vol. 2, no. 4
pp. 684 – 704

Abstract

The concept of searching for and localizing vehicles in live traffic videos based on descriptive textual input has yet to be explored in the scholarly literature. Endowing Intelligent Transportation Systems (ITS) with such a capability could help solve crimes on roadways. One major impediment to the advancement of fine-grained vehicle recognition models is the lack of video testbench datasets with annotated ground truth data. Additionally, to the best of our knowledge, no metrics currently exist for evaluating the robustness and performance efficiency of a vehicle recognition model on live videos, let alone of vehicle search and localization models. In this paper, we address these challenges by proposing V-Localize, a novel artificial intelligence framework for vehicle search and continuous localization in live traffic videos based on input textual descriptions. An efficient hashgraph algorithm is introduced to compute valid target information from textual input. This work further introduces two novel datasets to advance AI research in these challenging areas. These datasets include (a) the most diverse and large-scale Vehicle Color Recognition (VCoR) dataset, with 15 color classes, twice as many as in the largest existing such dataset, to facilitate finer-grained recognition with color information; and (b) a Vehicle Recognition in Video (VRiV) dataset, a first-of-its-kind video testbench dataset for evaluating the performance of vehicle recognition models in live videos rather than on still image data. The VRiV dataset will open new avenues for AI researchers to investigate innovative approaches that were previously intractable due to the lack of an annotated video testbench dataset for traffic vehicle recognition. Finally, to address the gap in the field, five novel metrics are introduced in this paper for adequately assessing the performance of vehicle recognition models in live videos. Ultimately, the proposed metrics could also prove intuitively effective for quantitative model evaluation in other video recognition applications. One major advantage of the proposed vehicle search and continuous localization framework is that it could be integrated into ITS software solutions to aid law enforcement, especially in critical cases such as AMBER Alerts or hit-and-run incidents.

Published in AI

ISSN: 2673-2688 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://www.mdpi.com/journal/ai
