IEEE Access (Jan 2021)
Vehicle Detection in Aerial Images Based on 3D Depth Maps and Deep Neural Networks
Abstract
Object detection in aerial images, particularly of vehicles, is highly important in remote sensing applications including traffic management, urban planning, parking space utilization, surveillance, and search and rescue. In this article, we investigate the ability of three-dimensional (3D) feature maps to improve the performance of deep neural networks (DNNs) for vehicle detection. First, we propose a DNN based on YOLOv3 with various base networks, including DarkNet-53, SqueezeNet, MobileNet-v2, and DenseNet-201. We assessed the base networks in combination with YOLOv3 in terms of efficiency, processing time, and the memory that each architecture required. In the second part, 3D depth maps were generated using pairs of aerial images and their parallax displacement. Next, a fully connected neural network (fcNN) was trained on 3D feature maps of trucks, semi-trailers, and trailers. A cascade of these two networks was then proposed to detect vehicles in aerial images. When the DNN detected a region, its coordinates and confidence level were used to extract the corresponding 3D features. The fcNN used these 3D features as input to improve the DNN performance. The data set used in this work was acquired from numerous flights of an unmanned aerial vehicle (UAV) across two industrial harbors over two years. The experimental results show that 3D features improved the precision of the DNNs from 88.23% to 96.43% and from 97.10% to 100% when using DNN confidence thresholds of 0.01 and 0.05, respectively. Accordingly, the proposed system removed 72.22% to 100% of the false positives from the DNN outputs. These results indicate that utilizing 3D features is important for improving object detection in aerial images and merits attention in future research.
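The cascade described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Detection` type, the feature vector (mean depth, maximum depth, detector confidence), and the `second_stage` callable are hypothetical stand-ins, and a simple depth threshold takes the place of the trained fcNN. The sketch only shows the control flow: first-stage detections above a confidence threshold are passed, together with depth features cropped from the 3D map, to a second stage that accepts or rejects them.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class Detection:
    box: Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), max exclusive
    confidence: float               # first-stage detector confidence


def extract_depth_features(depth_map: Sequence[Sequence[float]],
                           det: Detection) -> List[float]:
    """Hypothetical 3D feature vector: [mean depth, max depth, confidence]."""
    x0, y0, x1, y1 = det.box
    values = [depth_map[y][x] for y in range(y0, y1) for x in range(x0, x1)]
    if not values:  # empty box: no depth information available
        return [0.0, 0.0, det.confidence]
    return [sum(values) / len(values), max(values), det.confidence]


def cascade(detections: Sequence[Detection],
            depth_map: Sequence[Sequence[float]],
            second_stage: Callable[[List[float]], bool],
            conf_threshold: float = 0.05) -> List[Detection]:
    """Keep detections that pass the confidence threshold and the second stage."""
    kept = []
    for det in detections:
        if det.confidence < conf_threshold:
            continue  # discarded by the first-stage threshold
        if second_stage(extract_depth_features(depth_map, det)):
            kept.append(det)
    return kept


def precision(true_positives: int, false_positives: int) -> float:
    """Precision = TP / (TP + FP); defined as 0.0 when nothing is detected."""
    total = true_positives + false_positives
    return true_positives / total if total else 0.0


# Toy depth map (assumed values): a 3-unit-high object on the left, flat ground on the right.
depth_map = [[3.0, 3.0, 3.0, 0.0, 0.0, 0.0] for _ in range(4)]
detections = [
    Detection(box=(0, 0, 3, 4), confidence=0.9),   # over the raised object
    Detection(box=(3, 0, 6, 4), confidence=0.3),   # over flat ground (false positive)
    Detection(box=(0, 0, 3, 4), confidence=0.01),  # below the confidence threshold
]

# Stand-in for the trained fcNN: accept regions whose mean depth exceeds 1.0.
kept = cascade(detections, depth_map, second_stage=lambda f: f[0] > 1.0)
```

In this toy setting the flat-ground detection is rejected because its depth features show no elevation, which mirrors the false-positive removal reported in the abstract. In practice the lambda would be replaced by the trained second-stage network.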
Keywords