P2V-RCNN: Point to Voxel Feature Learning for 3D Object Detection From Point Clouds

Jiale Li; Yu Sun; Shujie Luo; Ziqi Zhu; Hang Dai; Andrey S. Krylov; Yong Ding; Ling Shao

doi:10.1109/ACCESS.2021.3094562

IEEE Access (Jan 2021)

Affiliations

Jiale Li: ORCiD; College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou, China
Yu Sun: ORCiD; School of Micro-Nano Electronics, Zhejiang University, Hangzhou, China
Shujie Luo: ORCiD; College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou, China
Ziqi Zhu: ORCiD; College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou, China
Hang Dai: ORCiD; Computer Vision Department, Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, United Arab Emirates
Andrey S. Krylov: ORCiD; Laboratory of Mathematical Methods of Image Processing, Lomonosov Moscow State University, Moscow, Russia
Yong Ding: ORCiD; College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou, China
Ling Shao: ORCiD; Inception Institute of Artificial Intelligence, Abu Dhabi, United Arab Emirates

Journal volume & issue: Vol. 9
pp. 98249 – 98260

Abstract

Most recent 3D object detectors for point clouds rely on the coarse voxel-based representation rather than the accurate point-based representation, because the voxel-based Region Proposal Network (RPN) achieves a higher box recall. However, detection accuracy is severely restricted by the loss of pose details in the voxels. Rather than treating the point cloud as a voxel-only or point-only representation, we propose a point-to-voxel feature learning approach that voxelizes the point cloud with both point-wise semantic features and local spatial features. This keeps the voxel-wise features needed to build a high-recall voxel-based RPN and also provides accurate point-wise features for refining the detection results. Another difficulty in object detection for point clouds is that the visible part of an object differs greatly from its full view because of perspective effects in data acquisition. To address this, we propose an attentive corner aggregation module that attentively aggregates the features of the local point cloud surrounding a 3D proposal from the perspectives of the eight corners of the proposal's 3D bounding box. Experimental results on the competitive KITTI 3D object detection benchmark show that the proposed method achieves state-of-the-art performance.
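
The two geometric ingredients named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `voxelize_point_features` and `box_corners`, the mean pooling used to merge point features inside a voxel, and the yaw-about-z box convention are all assumptions made for illustration. The learned point-wise features and the attention weights of the paper are not modelled here.

```python
import numpy as np


def voxelize_point_features(points, feats, voxel_size, pc_min):
    """Mean-pool per-point features into the voxels the points fall in.

    Hypothetical helper: mean pooling is an assumption, not the paper's method.
    points: (N, 3) xyz coordinates, feats: (N, C) per-point features.
    Returns (V, 3) integer voxel indices and (V, C) pooled features.
    """
    points = np.asarray(points, dtype=np.float64)
    feats = np.asarray(feats, dtype=np.float64)
    # Integer voxel index of every point along x, y, z.
    coords = np.floor((points - pc_min) / voxel_size).astype(np.int64)
    # Group points that share a voxel index.
    uniq, inverse = np.unique(coords, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)
    # Sum the features per voxel, then divide by the point count.
    sums = np.zeros((len(uniq), feats.shape[1]), dtype=np.float64)
    np.add.at(sums, inverse, feats)
    counts = np.bincount(inverse, minlength=len(uniq))
    return uniq, sums / counts[:, None]


def box_corners(center, size, yaw):
    """Eight corners of a 3D box rotated by `yaw` about the z axis.

    Hypothetical helper: the (length, width, height) order and the z-up
    rotation convention are assumptions.
    """
    signs = np.array(
        [[sx, sy, sz] for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)],
        dtype=np.float64,
    )
    local = 0.5 * signs * np.asarray(size, dtype=np.float64)
    c, s = np.cos(yaw), np.sin(yaw)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    # Rotate in the box frame, then translate to the box center.
    return local @ rot.T + np.asarray(center, dtype=np.float64)


if __name__ == "__main__":
    pts = [[0.1, 0.1, 0.1], [0.2, 0.3, 0.1], [1.5, 0.2, 0.2]]
    fts = [[1.0, 0.0], [3.0, 2.0], [5.0, 5.0]]
    idx, pooled = voxelize_point_features(pts, fts, 1.0, np.zeros(3))
    print(idx, pooled)
    print(box_corners([0.0, 0.0, 0.0], [2.0, 4.0, 6.0], 0.0))
```

In a detector of this kind, the pooled voxel features would feed the voxel-based RPN, while each of the eight corners would serve as a viewpoint from which nearby point features are gathered. How those gathered features are weighted and fused is the part this sketch leaves out.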

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
