FCNet: Stereo 3D Object Detection with Feature Correlation Networks

Yingyu Wu; Ziyan Liu; Yunlei Chen; Xuhui Zheng; Qian Zhang; Mo Yang; Guangming Tang

doi:10.3390/e24081121

Entropy (Aug 2022)

Affiliations

Yingyu Wu: College of Big Data and Information Engineering, Guizhou University, Guiyang 550025, China
Ziyan Liu: College of Big Data and Information Engineering, Guizhou University, Guiyang 550025, China
Yunlei Chen: College of Big Data and Information Engineering, Guizhou University, Guiyang 550025, China
Xuhui Zheng: College of Big Data and Information Engineering, Guizhou University, Guiyang 550025, China
Qian Zhang: College of Big Data and Information Engineering, Guizhou University, Guiyang 550025, China
Mo Yang: College of Big Data and Information Engineering, Guizhou University, Guiyang 550025, China
Guangming Tang: Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China

DOI: https://doi.org/10.3390/e24081121
Journal volume & issue: Vol. 24, no. 8, p. 1121

Abstract

Deep-learning techniques have significantly improved object detection performance, especially for binocular images in 3D scenarios. Supervising depth in stereo 3D object detection by reconstructing dense 3D depth from LiDAR point clouds raises the computational cost and lowers the inference speed. After exploring the intrinsic relationship between the implicit depth information and the semantic texture features of binocular images, we propose FCNet, an efficient and accurate 3D object detection algorithm for stereo images. First, we generate multi-scale feature maps from the input stereo images and use the normalized dot product to construct a multi-scale cost volume containing implicit depth information. Second, a variant attention model enhances the global and local description of this cost volume, and depth regression is supervised with a depth loss over sparse regions. Third, to balance the preservation of channel information in the re-fused left–right feature maps against the computational burden, a reweighting strategy is employed to enhance the feature correlation when merging the last-layer features of the binocular images. Extensive experimental results on the challenging KITTI benchmark demonstrate that the proposed algorithm achieves better performance in 3D object detection, with a lower computational cost and higher inference speed.
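The normalized dot-product cost volume mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `correlation_cost_volume`, the `(C, H, W)` feature layout, the `max_disp` parameter, and the convention that a left pixel at column x matches a right pixel at column x - d are all assumptions made for illustration. A multi-scale variant would simply apply the same function to feature maps at several resolutions.

```python
import numpy as np

def correlation_cost_volume(left, right, max_disp, eps=1e-8):
    """Normalized dot-product correlation between stereo feature maps.

    left, right: arrays of shape (C, H, W) (assumed layout).
    Returns an array of shape (max_disp, H, W) holding, for each candidate
    disparity d, the cosine similarity between left[:, y, x] and
    right[:, y, x - d]. Positions with x < d have no match and stay 0.
    """
    # L2-normalize each pixel's channel vector so that the dot product
    # becomes a cosine similarity in [-1, 1].
    left_n = left / (np.linalg.norm(left, axis=0, keepdims=True) + eps)
    right_n = right / (np.linalg.norm(right, axis=0, keepdims=True) + eps)

    _, height, width = left.shape
    volume = np.zeros((max_disp, height, width), dtype=left.dtype)
    for d in range(max_disp):
        if d == 0:
            volume[d] = (left_n * right_n).sum(axis=0)
        else:
            # Shift the right features by d columns before correlating.
            volume[d, :, d:] = (left_n[:, :, d:] * right_n[:, :, :-d]).sum(axis=0)
    return volume
```

Each slice of the returned volume scores one disparity hypothesis, so depth is only implicit in where the similarity peaks; no dense LiDAR-derived depth map is needed to build it.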

Published in Entropy

ISSN: 1099-4300 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Science: Astronomy: Astrophysics; Science: Physics
Website: http://www.mdpi.com/journal/entropy
