Boosting 3D point-based object detection by reducing information loss caused by discontinuous receptive fields

Ao Liang; Haiyang Hua; Jian Fang; Huaici Zhao; Tianci Liu

International Journal of Applied Earth Observations and Geoinformation (Aug 2024)

Affiliations

Ao Liang: Key Laboratory of Opto-Electronic Information Processing, Chinese Academy of Sciences, Shenyang, 110016, China; Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang, 110016, China; Institutes for Robotics and Intelligent Manufacturing, Shenyang, 110169, China; University of Chinese Academy of Sciences, Beijing, 100049, China
Haiyang Hua: Key Laboratory of Opto-Electronic Information Processing, Chinese Academy of Sciences, Shenyang, 110016, China; Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang, 110016, China; Institutes for Robotics and Intelligent Manufacturing, Shenyang, 110169, China; Key Laboratory of Optical Information and Simulation Technology, Shenyang, 110016, China
Jian Fang: Key Laboratory of Opto-Electronic Information Processing, Chinese Academy of Sciences, Shenyang, 110016, China; Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang, 110016, China; Institutes for Robotics and Intelligent Manufacturing, Shenyang, 110169, China; Key Laboratory of Optical Information and Simulation Technology, Shenyang, 110016, China
Huaici Zhao: Key Laboratory of Opto-Electronic Information Processing, Chinese Academy of Sciences, Shenyang, 110016, China; Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang, 110016, China; Institutes for Robotics and Intelligent Manufacturing, Shenyang, 110169, China; Key Laboratory of Optical Information and Simulation Technology, Shenyang, 110016, China; Correspondence to: No. 135 Chuangxin Road, Hunnan District, Shenyang City, Liaoning Province.
Tianci Liu: Key Laboratory of Opto-Electronic Information Processing, Chinese Academy of Sciences, Shenyang, 110016, China; Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang, 110016, China; Institutes for Robotics and Intelligent Manufacturing, Shenyang, 110169, China

Journal volume & issue: Vol. 132, p. 104049

Abstract

Point-based 3D object detection methods are lightweight and fast at inference, which makes them well suited to engineering applications such as intelligent transportation and autonomous driving. However, current advanced methods learn features only from the observed point cloud and neglect the role of unoccupied space. This leads to a discontinuous receptive field (DRF), which causes loss of the objects’ semantic and geometric information. To address this issue, we propose DRF-SSD, a new end-to-end, single-stage, point-based model. DRF-SSD uses a PointNet++-style 3D backbone to maintain fast inference. In the Neck structure, point-wise features are then projected onto a plane, and local and global information is aggregated by the designed Hierarchical Encoding–Decoding (HED) and Hybrid Transformer (HT) modules. HED fills in features for unoccupied space through convolutional layers, enhancing local features through interaction with features in occupied space during learning. HT further expands the receptive field using the global learning ability of transformers. The spatial transformation and learning processes in HED and HT involve only key points, and HED has a special structure that maintains the sparsity of the feature maps, preserving the model’s inference efficiency. Finally, query features are back-projected onto the points for feature enhancement and fed into the detection head for prediction. Extensive experiments on the KITTI dataset show that DRF-SSD achieves higher detection accuracy than previous methods. Specifically, it improves 3D Average Precision (AP3D) by 2.25%, 0.66%, and 0.42% under the easy, moderate, and hard settings, respectively. Additionally, the method enables other point-based detectors to achieve substantial gains, demonstrating its effectiveness. Our code will be made available at https://github.com/AlanLiangC/DRF-SSD.git.
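The project, fill, and back-project idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the DRF-SSD implementation: the function names, the mean pooling per grid cell, and the fixed 3x3 box filter (standing in for the learned convolutional layers of HED) are all assumptions made for illustration, and the transformer stage is omitted entirely.

```python
import numpy as np


def scatter_to_plane(xy, feats, grid_min, cell, shape):
    """Mean-pool point features into a 2D grid (hypothetical helper)."""
    idx = np.floor((xy - grid_min) / cell).astype(int)
    idx = np.clip(idx, 0, np.array(shape) - 1)
    plane = np.zeros(shape + (feats.shape[1],), dtype=float)
    counts = np.zeros(shape, dtype=float)
    # Unbuffered accumulation so several points in one cell are all counted.
    np.add.at(plane, (idx[:, 0], idx[:, 1]), feats)
    np.add.at(counts, (idx[:, 0], idx[:, 1]), 1.0)
    occupied = counts > 0
    plane[occupied] /= counts[occupied][:, None]
    return plane, occupied, idx


def box_filter3(plane):
    """Fixed 3x3 averaging filter; a stand-in for a learned convolution."""
    h, w = plane.shape[:2]
    padded = np.pad(plane, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros_like(plane)
    for di in range(3):
        for dj in range(3):
            out += padded[di:di + h, dj:dj + w]
    return out / 9.0


def project_fill_backproject(xy, feats, grid_min, cell, shape):
    """Project points to a plane, spread features, and gather them back."""
    plane, occupied, idx = scatter_to_plane(xy, feats, grid_min, cell, shape)
    filled = box_filter3(plane)  # empty cells now receive neighbour features
    # Back-projection: each point reads its cell and adds it as a residual.
    enhanced = feats + filled[idx[:, 0], idx[:, 1]]
    return enhanced, plane, filled, occupied


# Two points separated by one empty cell along the first grid axis.
xy = np.array([[0.5, 0.5], [2.5, 0.5]])
feats = np.array([[9.0], [18.0]])
enhanced, plane, filled, occupied = project_fill_backproject(
    xy, feats, grid_min=np.array([0.0, 0.0]), cell=1.0, shape=(4, 4)
)
```

In this toy case the cell between the two points is unoccupied and carries no feature after projection, but receives a non-zero feature after filtering, so the two points are connected through what was previously a gap in the receptive field.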

Published in International Journal of Applied Earth Observations and Geoinformation

ISSN: 1569-8432 (Print); 1872-826X (Online)
Publisher: Elsevier
Country of publisher: Netherlands
LCC subjects: Geography. Anthropology. Recreation: Physical geography; Geography. Anthropology. Recreation: Environmental sciences
Website: https://www.journals.elsevier.com/international-journal-of-applied-earth-observation-and-geoinformation
