International Journal of Applied Earth Observations and Geoinformation (Jun 2023)

MENet: Map-enhanced 3D object detection in bird’s-eye view for LiDAR point clouds

  • Yuanxian Huang,
  • Jian Zhou,
  • Xicheng Li,
  • Zhen Dong,
  • Jinsheng Xiao,
  • Shurui Wang,
  • Hongjuan Zhang

Journal volume & issue: Vol. 120, p. 103337

Abstract


Three-dimensional (3D) object detection uses onboard sensors to determine the position, size, and motion of surrounding objects. Recently, some researchers have incorporated HD maps into 3D object detection for LiDAR point clouds. However, existing LiDAR–map fusion detection methods simply take the HD map as an additional input, leaving the richer prior information in the map underexplored. To address this limitation, we employ the HD map in data augmentation for LiDAR 3D detection to eliminate the distribution bias between the augmented training set and the test set. Furthermore, we propose a map-enhanced 3D object detection network, called MENet, for LiDAR point clouds. Compared with a standard LiDAR-only detector in bird's-eye view (BEV), MENet extends LiDAR-based baselines with a map encoder that encodes the input rasterized HD map into a BEV feature, and a multimodal fuser that aggregates the map BEV feature with the LiDAR BEV feature extracted in the LiDAR stream. MENet achieved a mean average precision (mAP) of 56.90% and a nuScenes detection score (NDS) of 63.43%, improving the mAP and NDS of its LiDAR-only backbone by 4.18% and 2.22%, respectively. Although the HD map is necessary at training time, the map-enhanced 3D detector works robustly and outperforms the original baseline even when no map input is available at test time. Experimental results further indicate that the proposed map plugins improve the performance of both anchor-based and anchor-free models on different datasets. The code will be open-sourced at https://github.com/WHU-USI3DV/MENet.
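The encode-then-fuse pipeline described above can be illustrated with a minimal NumPy sketch. This is an assumption-laden toy, not the paper's implementation: the map encoder and fuser are each reduced to a per-cell 1x1 projection, and the channel counts, grid size, and map layers (e.g. drivable area, lanes, crosswalks, stop lines) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode_map(raster_map, weights):
    """Toy 'map encoder': project a rasterized HD map (C_in, H, W)
    into a BEV feature (C_out, H, W) via a 1x1 convolution, i.e. a
    per-cell linear projection followed by ReLU. Illustrative only;
    the paper's actual encoder architecture is not specified here."""
    c_in, h, w = raster_map.shape
    flat = raster_map.reshape(c_in, -1)            # (C_in, H*W)
    out = weights @ flat                           # (C_out, H*W)
    return np.maximum(out, 0.0).reshape(-1, h, w)  # ReLU, back to (C_out, H, W)

def fuse_bev(lidar_bev, map_bev, fuse_w):
    """Toy 'multimodal fuser': concatenate the LiDAR and map BEV
    features along the channel axis, then mix channels with a
    1x1 convolution so the output matches the LiDAR channel count."""
    assert lidar_bev.shape[1:] == map_bev.shape[1:], "BEV grids must align"
    cat = np.concatenate([lidar_bev, map_bev], axis=0)
    c, h, w = cat.shape
    return (fuse_w @ cat.reshape(c, -1)).reshape(-1, h, w)

# Hypothetical shapes: 4 rasterized map layers, 64-channel BEV
# features, and a 128x128 BEV grid.
raster_map = rng.random((4, 128, 128))
lidar_bev = rng.random((64, 128, 128))
map_w = rng.standard_normal((64, 4)) * 0.1    # map encoder weights
fuse_w = rng.standard_normal((64, 128)) * 0.1  # fuser weights (64+64 -> 64)

map_bev = encode_map(raster_map, map_w)
fused = fuse_bev(lidar_bev, map_bev, fuse_w)
print(fused.shape)  # fused BEV feature passed on to the detection head
```

Because the fuser restores the LiDAR channel count, the fused feature is a drop-in replacement for the LiDAR-only BEV feature, which is consistent with the abstract's claim that the map plugins extend existing anchor-based and anchor-free baselines.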

Keywords