IEEE Access (Jan 2020)

An End-to-End Deep Learning Network for 3D Object Detection From RGB-D Data Based on Hough Voting

  • Ming Yan,
  • Zhongtong Li,
  • Xinyan Yu,
  • Cong Jin

DOI
https://doi.org/10.1109/ACCESS.2020.3012695
Journal volume & issue
Vol. 8
pp. 138810 – 138822

Abstract

Read online

Existing outdoor three-dimensional (3D) object detection algorithms mainly use a single type of sensor, for example, only using a monocular camera or radar point cloud. However, camera sensors are affected by light and lose depth information. When scanning a distant object or an occluded object, the data collected by the short-range radar point cloud sensor are very sparse, which affects the detection algorithm. To address the above challenges, we design a deep learning network that can combine the texture information of two-dimensional (2D) data and the geometric information of 3D data for object detection. To solve the problem of a single sensor, we use a reverse mapping layer and an aggregation layer to combine the texture information of RGB data with the geometric information of point cloud data and design a maximum pooling layer to deal with the input of multi-view cameras. In addition, to solve the defects of the 3D object detection algorithm based on the region proposal network (RPN) method, we use the Hough voting algorithm implemented by a deep neural network to suggest objects. Experimental results show that our algorithm has a 1.06% decrease in average precision (AP) compared to PointRCNN in easy car object detection, but our algorithm requires 37.7% less time to calculate than PointRCNN under the same hardware environment. Moreover, our algorithm improves the AP by 1.14% compared to PointRCNN in hard car object detection.

Keywords