LapUNet: a novel approach to monocular depth estimation using dynamic laplacian residual U-shape networks

Yanhui Xi; Sai Li; Zhikang Xu; Feng Zhou; Juanxiu Tian

doi:10.1038/s41598-024-74445-x

Scientific Reports (Oct 2024)

Affiliations

Yanhui Xi: School of Electrical and Information Engineering, Changsha University of Science and Technology
Sai Li: School of Electrical and Information Engineering, Changsha University of Science and Technology
Zhikang Xu: School of Electrical and Information Engineering, Changsha University of Science and Technology
Feng Zhou: School of Electronic Information and Electrical Engineering, Changsha University
Juanxiu Tian: College of Computer and Communication, Hunan Institute of Engineering

DOI: https://doi.org/10.1038/s41598-024-74445-x
Journal volume & issue: Vol. 14, no. 1
pp. 1 – 13

Abstract

Monocular depth estimation is an important but challenging task. Although various encoder-decoder architectures have improved performance, the estimated depth maps still lack structural detail and clear edges because of simple repeated upsampling. To solve this problem, this paper presents LapUNet (Laplacian U-shape networks), in which the encoder adopts ResNeXt101 and the decoder is built from the novel DLRU (dynamic Laplacian residual U-shape) module. Based on the U-shape structure, the DLRU module supplements high-frequency features by fusing a dynamic Laplacian residual into the upsampling process; the residual is dynamically learnable owing to an added convolutional operation. In addition, an ASPP (atrous spatial pyramid pooling) module is introduced to capture image context at multiple scales through multiple parallel atrous convolutional layers, and a depth map fusion module combines high- and low-frequency features from depth maps with different spatial resolutions. Experiments demonstrate that the proposed model, with a moderate model size, outperforms previous competitors on the KITTI and NYU Depth V2 datasets. Furthermore, 3D reconstruction and target ranging using the estimated depth maps demonstrate the effectiveness of the proposed method.
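
The idea of supplementing high-frequency detail during upsampling can be illustrated with a classical Laplacian residual: the difference between an image and its downsampled-then-upsampled copy. The sketch below is not the paper's DLRU module. It assumes 2x2 average pooling and nearest-neighbour upsampling, omits the learnable convolution that makes the residual "dynamic", and uses hypothetical helper names (`downsample`, `upsample`, `laplacian_residual`, `add`) on plain nested lists.

```python
def downsample(img):
    """Average-pool a 2D list by 2x2 blocks (assumes even height and width)."""
    h, w = len(img), len(img[0])
    return [
        [
            (img[2 * i][2 * j] + img[2 * i][2 * j + 1]
             + img[2 * i + 1][2 * j] + img[2 * i + 1][2 * j + 1]) / 4.0
            for j in range(w // 2)
        ]
        for i in range(h // 2)
    ]


def upsample(img):
    """Nearest-neighbour upsampling by a factor of 2 in both directions."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out


def laplacian_residual(img):
    """High-frequency detail lost by a downsample/upsample round trip."""
    low = upsample(downsample(img))
    return [[a - b for a, b in zip(r_img, r_low)]
            for r_img, r_low in zip(img, low)]


def add(a, b):
    """Element-wise sum of two equally sized 2D lists."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


# A 4x4 image with a sharp vertical edge in the last column.
image = [[0.0, 0.0, 0.0, 1.0] for _ in range(4)]

coarse = upsample(downsample(image))   # blurred: the edge is smeared to 0.5
residual = laplacian_residual(image)   # nonzero only around the edge
restored = add(coarse, residual)       # fusing the residual restores the edge
```

In this toy case the coarse map alone blurs the edge, while adding the residual back recovers the original exactly. In a learned decoder the residual would typically be computed on feature maps and passed through trainable layers instead of being added verbatim.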

Published in Scientific Reports

ISSN: 2045-2322 (Online)
Publisher: Nature Portfolio
Country of publisher: United Kingdom
LCC subjects: Medicine; Science
Website: https://www.nature.com/srep/