Multi-View Joint Learning and BEV Feature-Fusion Network for 3D Object Detection

Qunming Liu; Xiaodong Li; Xiaofei Zhang; Xiaojun Tan; Bodong Shi

doi:10.3390/app13095274

Applied Sciences (Apr 2023)

Affiliations

Qunming Liu: School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen 510275, China
Xiaodong Li: School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen 510275, China
Xiaofei Zhang: School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen 510275, China
Xiaojun Tan: School of Intelligent Systems Engineering, Sun Yat-sen University & Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), Zhuhai 519082, China
Bodong Shi: School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen 510275, China

DOI: https://doi.org/10.3390/app13095274
Journal volume & issue: Vol. 13, no. 9, p. 5274

Abstract

Traditional 3D object detectors use BEV (bird’s eye view) feature maps to generate 3D object proposals, but a single BEV feature map inevitably suffers from high compression and information loss. To address this problem, we propose a multi-view joint learning and BEV feature-fusion network built around two fusion modules: the multi-view feature-fusion module and the multi-BEV feature-fusion module. The multi-view feature-fusion module performs joint learning from multiple views and fuses the features learned from them, supplementing information missing from the BEV feature map. The multi-BEV feature-fusion module fuses the BEV feature maps output by different feature extractors to further enrich the BEV feature map and thereby generate higher-quality 3D object proposals. We conducted experiments on the widely used KITTI dataset. The results show that our method significantly improves detection accuracy for the cyclist category. On the cyclist detection task at the easy, moderate, and hard levels of the KITTI test set, our method improves on PV-RCNN by 1.57%, 2.03%, and 0.67%, respectively.
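
The information loss that the abstract attributes to a single BEV map can be illustrated with a toy projection. The sketch below is not the authors' network: the `project` helper, the 1 m cell size, and the three sample points are hypothetical, chosen only to show that a top-down view merges points that differ in height, while a second view keeps them apart.

```python
from collections import defaultdict


def project(points, axes, cell=1.0):
    """Bin 3D points into a 2D occupancy grid over the two chosen axes.

    The axis that is not listed in `axes` is discarded, which is the
    compression step of any single-view projection.
    """
    grid = defaultdict(int)
    for p in points:
        key = tuple(int(p[a] // cell) for a in axes)
        grid[key] += 1
    return dict(grid)


# Hypothetical toy points (x forward, y left, z up), in metres.
points = [
    (2.2, 0.5, 0.3),  # low point
    (2.4, 0.6, 1.7),  # high point almost directly above the first
    (5.1, 3.2, 0.4),  # unrelated point elsewhere in the scene
]

bev = project(points, axes=(0, 1))    # top-down view: z is dropped
front = project(points, axes=(1, 2))  # view along x: x is dropped

# The two stacked points share one BEV cell but occupy separate cells
# in the second view, so that view carries information the BEV lacks.
print(len(bev), len(front))
```

Here the BEV grid has two occupied cells and the second view has three, which is the kind of complementary information a multi-view fusion step could draw on.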

Published in Applied Sciences

ISSN: 2076-3417 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Engineering (General). Civil engineering (General); Science: Biology (General); Science: Physics; Science: Chemistry
Website: http://www.mdpi.com/journal/applsci
