IEEE Access (Jan 2021)

Self-Supervised Monocular Depth Estimation With Extensive Pretraining

  • Hyukdoo Choi

DOI
https://doi.org/10.1109/ACCESS.2021.3129628
Journal volume & issue
Vol. 9
pp. 157236 – 157246

Abstract

Although depth estimation is a key technology for three-dimensional sensing applications involving motion, active sensors such as LiDAR and depth cameras tend to be expensive and bulky. Here, we explore the potential of monocular depth estimation (MDE) based on a self-supervised approach. MDE is a promising technology, but supervised learning requires accurate ground-truth depth data. Recent studies have enabled self-supervised training of an MDE model using only monocular image sequences and image-reconstruction errors. We pretrained networks on multiple datasets, including monocular and stereo image sequences. The main challenges for the self-supervised MDE model were occlusions and dynamic objects. To handle these problems, we proposed two novel loss functions, the min-over-all and min-with-flow losses, both based on the per-pixel minimum reprojection error of Monodepth2 and extended to stereo images and optical flow. With extensive pretraining and the novel losses, our model outperformed existing unsupervised approaches both in quantitative depth estimation and in the ability to distinguish small objects against a background, as evaluated on KITTI 2015.
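
The per-pixel minimum reprojection error that both proposed losses build on can be illustrated with a minimal sketch. This is not the paper's min-over-all or min-with-flow loss: it assumes the source frames have already been warped into the target view, it uses a plain L1 photometric error where Monodepth2 combines SSIM and L1, and the function name `per_pixel_min_reprojection_error` and its arguments are hypothetical.

```python
import numpy as np


def per_pixel_min_reprojection_error(target, reconstructions):
    """Illustrative per-pixel minimum reprojection error (hypothetical helper).

    target:          (H, W, C) array, the target frame.
    reconstructions: sequence of (H, W, C) arrays, source frames that are
                     assumed to be already warped into the target view.
    Returns (scalar loss, (H, W) map of per-pixel minimum errors).
    """
    target = np.asarray(target, dtype=np.float64)
    # L1 photometric error per pixel, averaged over channels: shape (N, H, W).
    # Assumption: plain L1 stands in for the SSIM + L1 mix used by Monodepth2.
    errors = np.stack([
        np.abs(np.asarray(recon, dtype=np.float64) - target).mean(axis=-1)
        for recon in reconstructions
    ])
    # Keep, at every pixel, only the best-matching reconstruction, so a pixel
    # that is occluded in one source frame does not dominate the loss.
    min_error = errors.min(axis=0)
    return float(min_error.mean()), min_error
```

Taking the minimum instead of the average means that a pixel visible in at least one source frame contributes only the error from that frame, which is the property that makes this family of losses more tolerant of occlusions.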

Keywords