Self-Supervised Monocular Depth Learning in Low-Texture Areas

Wanpeng Xu; Ling Zou; Lingda Wu; Zhipeng Fu

doi:10.3390/rs13091673

Remote Sensing (Apr 2021)

Affiliations

Wanpeng Xu: Science and Technology on Complex Electronic System Simulation Laboratory, Space Engineering University, Beijing 101416, China
Ling Zou: Digital Media School, Beijing Film Academy, Beijing 100088, China
Lingda Wu: Science and Technology on Complex Electronic System Simulation Laboratory, Space Engineering University, Beijing 101416, China
Zhipeng Fu: Peng Cheng Laboratory, Shenzhen 518055, China

DOI: https://doi.org/10.3390/rs13091673
Journal volume & issue: Vol. 13, no. 9, p. 1673

Abstract

For the task of monocular depth estimation, self-supervised learning supervises training by calculating the pixel difference between the target image and the warped reference image, obtaining results comparable to those with full supervision. However, problematic pixels in low-texture regions are usually ignored: when stereo pairs are taken as the input, most researchers assume that no pixels violate the camera-motion assumption, which leads to optimization problems in these regions. To tackle this problem, we compute the photometric loss on the lowest-level feature maps instead and apply first- and second-order smoothing to the depth, ensuring consistent gradients during optimization. Given the shortcomings of ResNet as the backbone, we propose a new depth estimation network architecture to improve edge location accuracy and obtain clear outline information even in smoothed low-texture boundaries. To acquire more stable and reliable quantitative evaluation results, we introduce a virtual data set into the self-supervised task, because such data sets provide dense depth maps with pixel-by-pixel correspondence. Taking stereo pairs as the input, we achieve performance that exceeds that of prior methods on the Eigen splits of both the KITTI and VKITTI2 data sets.
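
As a rough illustration of the loss terms named in the abstract, the following sketch shows one common way to write a photometric difference and first- and second-order smoothness penalties on a depth map. It is not the authors' implementation: the function names `finite_diff`, `smoothness_loss`, and `photometric_l1` are hypothetical, the inputs are plain nested lists rather than network feature maps, and the exact weighting and edge-aware terms used in the paper are not given here and are therefore left out.

```python
def finite_diff(values, order):
    """Apply `order` successive forward differences to a 1D sequence."""
    values = list(values)
    for _ in range(order):
        values = [b - a for a, b in zip(values, values[1:])]
    return values


def smoothness_loss(depth, order=1):
    """Mean absolute finite difference of a 2D depth map, along x plus along y.

    order=1 penalises any depth slope; order=2 penalises only curvature.
    This is an assumed, generic formulation, not the paper's exact loss.
    """
    rows = [list(r) for r in depth]
    cols = [list(c) for c in zip(*rows)]
    dx = [abs(v) for r in rows for v in finite_diff(r, order)]
    dy = [abs(v) for c in cols for v in finite_diff(c, order)]
    return sum(dx) / len(dx) + sum(dy) / len(dy)


def photometric_l1(target, warped):
    """Mean absolute difference between two equally sized 2D maps."""
    diffs = [abs(t - w)
             for t_row, w_row in zip(target, warped)
             for t, w in zip(t_row, w_row)]
    return sum(diffs) / len(diffs)


# A planar ramp: depth grows linearly along x and is constant along y.
ramp = [[float(x) for x in range(5)] for _ in range(4)]
first = smoothness_loss(ramp, order=1)   # nonzero: the slope is penalised
second = smoothness_loss(ramp, order=2)  # zero: a plane has no curvature
```

On the planar ramp the first-order term is nonzero while the second-order term vanishes, which is the usual motivation for combining the two: the second-order term allows slanted but flat surfaces, while the first-order term still supplies a gradient where the photometric signal is weak.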

Published in Remote Sensing

ISSN: 2072-4292 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Science
Website: http://www.mdpi.com/journal/remotesensing/
