Известия высших учебных заведений России: Радиоэлектроника (Jun 2021)
Method for Automatic Determination of a 3D Trajectory of Vehicles in a Video Image
Abstract
Introduction. An important part of an automotive unmanned vehicle (UV) control system is the environment analysis module. This module is based on various types of sensors, e.g. video cameras, lidars and radars. The development of computer and video technologies makes it possible to implement an environment analysis module using a single video camera as a sensor. This approach is expected to reduce the cost of the entire module. The main task in video image processing is to analyse the environment as a 3D scene. The 3D trajectory of an object, which takes into account its dimensions, angle of view and movement vector, as well as the vehicle pose in a video image, provides sufficient information for assessing the real interaction of objects. A basis for constructing a 3D trajectory is vehicle pose estimation.Aim. To develop an automatic method for estimating vehicle pose based on video data analysis from a single video camera.Materials and methods. An automatic method for vehicle pose estimation from a video image was proposed based on a cascade approach. The method includes vehicle detection, key points determination, segmentation and vehicle pose estimation. Vehicle detection and determination of its key points were resolved via a neural network. The segmentation of a vehicle video image and its mask preparation were implemented by transforming it into a polar coordinate system and searching for the outer contour using graph theory.Results. The estimation of vehicle pose was implemented by matching the Fourier image of vehicle mask signatures and the templates obtained based on 3D models. The correctness of the obtained vehicle pose and angle of view estimation was confirmed by experiments based on the proposed method. The vehicle pose estimation had an accuracy of 89 % on an open Carvana image dataset.Conclusion. A new approach for vehicle pose estimation was proposed, involving the transition from end-to-end learning of neural networks to resolve several problems at once, e.g., localization, classification, segmentation, and angle of view, towards cascade analysis of information. The accuracy level of end-to-end learning requires large sets of representative data, which complicates the scalability of solutions for road environments in Russia. The proposed method makes it possible to estimate the vehicle pose with a high accuracy level, at the same time as involving no large costs for manual data annotation and training.
Keywords