An End-to-End Deep Learning Network for 3D Object Detection From RGB-D Data Based on Hough Voting

Ming Yan; Zhongtong Li; Xinyan Yu; Cong Jin

doi:10.1109/ACCESS.2020.3012695

IEEE Access (Jan 2020)

Affiliations

Ming Yan: State Key Laboratory of Media Convergence and Communication, Communication University of China, Beijing, China
Zhongtong Li: School of Information and Telecommunications Engineering, Communication University of China, Beijing, China
Xinyan Yu: School of Data Science and Media Intelligence, Communication University of China, Beijing, China
Cong Jin: School of Information and Telecommunications Engineering, Communication University of China, Beijing, China

DOI: https://doi.org/10.1109/ACCESS.2020.3012695
Journal volume & issue: Vol. 8, pp. 138810–138822

Abstract

Existing outdoor three-dimensional (3D) object detection algorithms mainly rely on a single type of sensor, for example, a monocular camera alone or a radar point cloud alone. However, camera sensors are sensitive to lighting conditions and do not capture depth information, and when a distant or occluded object is scanned, the data collected by a short-range radar point cloud sensor are very sparse, which degrades detection performance. To address these challenges, we design a deep learning network that combines the texture information of two-dimensional (2D) data with the geometric information of 3D data for object detection. To overcome the limitations of a single sensor, we use a reverse mapping layer and an aggregation layer to combine the texture information of RGB data with the geometric information of point cloud data, and we design a maximum pooling layer to handle input from multi-view cameras. In addition, to address the shortcomings of 3D object detection algorithms based on the region proposal network (RPN) method, we use a Hough voting algorithm implemented by a deep neural network to propose objects. Experimental results show that, in easy car object detection, the average precision (AP) of our algorithm is 1.06% lower than that of PointRCNN, but our algorithm requires 37.7% less computation time than PointRCNN under the same hardware environment. Moreover, in hard car object detection, our algorithm improves the AP by 1.14% compared to PointRCNN.
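
The two ideas named in the abstract, maximum pooling over multi-view camera features and Hough voting for object proposals, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the hard-coded offsets (which a neural network would predict in the paper's setting), and the grid-cell grouping of votes are all assumptions made for illustration only.

```python
from collections import defaultdict


def max_pool_views(view_features):
    """Fuse per-view feature vectors for one point by element-wise maximum.

    view_features: list of equal-length feature lists, one per camera view
    (hypothetical layout, not taken from the paper).
    """
    return [max(channel) for channel in zip(*view_features)]


def hough_vote_centers(seeds, offsets, cell=1.0, min_votes=2):
    """Each seed point votes for an object center at seed + offset.

    Votes are grouped by grid cell (an assumed stand-in for a learned
    clustering step); cells with at least min_votes votes yield one
    proposal located at the mean of their votes.
    """
    bins = defaultdict(list)
    for seed, offset in zip(seeds, offsets):
        vote = tuple(s + o for s, o in zip(seed, offset))
        key = tuple(int(v // cell) for v in vote)
        bins[key].append(vote)

    centers = []
    for votes in bins.values():
        if len(votes) >= min_votes:
            n = len(votes)
            centers.append(tuple(sum(axis) / n for axis in zip(*votes)))
    return centers


# Two views of the same point, two feature channels each.
fused = max_pool_views([[0.1, 0.9], [0.5, 0.2]])

# Three seed points; the first two vote for the same center.
seeds = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 5.0, 5.0)]
offsets = [(0.5, 0.5, 0.5), (-0.5, 0.5, 0.5), (0.0, 0.0, 0.0)]
proposals = hough_vote_centers(seeds, offsets)
```

In this toy input, the isolated third seed gathers only one vote and so produces no proposal, which is the intuition behind voting: agreement among several points is what suggests an object.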

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
