International Journal of Applied Earth Observations and Geoinformation (Aug 2024)
Boosting 3D point-based object detection by reducing information loss caused by discontinuous receptive fields
Abstract
Point-based 3D object detection methods are lightweight and fast at inference, which makes them valuable in engineering fields such as intelligent transportation and autonomous driving. However, current advanced methods learn features only from the provided point cloud and neglect the active role of unoccupied space. This causes the discontinuous receptive field (DRF) problem, which leads to the loss of semantic and geometric information about objects. To address this issue, we propose DRF-SSD, a new end-to-end single-stage point-based model. DRF-SSD uses a PointNet++-style 3D backbone to maintain fast inference. In the Neck structure, point-wise features are then projected onto a plane, and local and global information is aggregated by the designed Hierarchical Encoding–Decoding (HED) and Hybrid Transformer (HT) modules. HED fills in features for unoccupied space through convolutional layers, enhancing local features through interaction with features in occupied space during learning. HT further expands the receptive field using the global learning ability of transformers. The spatial transformation and learning processes in HED and HT involve only key points, and HED has a special structure that maintains the sparsity of the feature maps, preserving the model's inference efficiency. Finally, query features are back-projected onto the points for feature enhancement and fed into the detection head for prediction. Extensive experiments on the KITTI dataset demonstrate that DRF-SSD achieves higher detection accuracy than previous methods. Specifically, it improves 3D Average Precision (AP3D) by 2.25%, 0.66%, and 0.42% under the easy, moderate, and hard settings, respectively. Additionally, the method enables other point-based detectors to achieve substantial gains, demonstrating its effectiveness. Our code will be made available at https://github.com/AlanLiangC/DRF-SSD.git.
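The project, fill, and back-project idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the DRF-SSD implementation: the fixed 3x3 neighborhood average is only a stand-in for the learned convolutional layers of HED, the HT module is omitted, and all function names, the grid parameters (`grid_min`, `cell`, `shape`), and the choice to concatenate the gathered features are assumptions made for illustration.

```python
import numpy as np


def scatter_to_plane(xy, feats, grid_min, cell, shape):
    """Mean-pool point features into the cells of a 2D grid (hypothetical helper)."""
    H, W = shape
    idx = np.floor((np.asarray(xy) - np.asarray(grid_min)) / cell).astype(int)
    ix = np.clip(idx[:, 0], 0, W - 1)
    iy = np.clip(idx[:, 1], 0, H - 1)
    plane = np.zeros((H, W, feats.shape[1]))
    count = np.zeros((H, W))
    np.add.at(plane, (iy, ix), feats)   # accumulate features per cell
    np.add.at(count, (iy, ix), 1)       # number of points per cell
    occupied = count > 0
    plane[occupied] /= count[occupied][:, None]
    return plane, occupied, iy, ix


def fill_unoccupied(plane, occupied):
    """Average over occupied cells in each 3x3 window.

    Assumption: a fixed averaging kernel replaces the learned convolutions,
    so empty cells next to occupied ones receive non-zero features.
    """
    H, W, _ = plane.shape
    padded = np.pad(plane, ((1, 1), (1, 1), (0, 0)))
    padded_occ = np.pad(occupied.astype(float), 1)
    num = np.zeros_like(plane)
    den = np.zeros((H, W))
    for dy in range(3):
        for dx in range(3):
            num += padded[dy:dy + H, dx:dx + W]
            den += padded_occ[dy:dy + H, dx:dx + W]
    filled = np.zeros_like(plane)
    valid = den > 0
    filled[valid] = num[valid] / den[valid][:, None]
    return filled


def enhance_points(xy, feats, grid_min, cell, shape):
    """Project to the plane, fill empty cells, then gather back to the points."""
    plane, occupied, iy, ix = scatter_to_plane(xy, feats, grid_min, cell, shape)
    filled = fill_unoccupied(plane, occupied)
    gathered = filled[iy, ix]           # back-projection: one feature per point
    return np.concatenate([feats, gathered], axis=1), filled
```

In this sketch, a cell that holds no points but lies between two occupied cells receives the average of their features, which is the sense in which unoccupied space can bridge an otherwise discontinuous receptive field.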