Sensors (Feb 2019)

Predicting Depth from Single RGB Images with Pyramidal Three-Streamed Networks

  • Songnan Chen,
  • Mengxia Tang,
  • Jiangming Kan

DOI
https://doi.org/10.3390/s19030667
Journal volume & issue
Vol. 19, no. 3
p. 667

Abstract

Predicting depth from a monocular image is an ill-posed and inherently ambiguous problem in computer vision. In this paper, we propose a pyramidal three-streamed network (PTSN) that recovers depth information from a single given RGB image. PTSN takes images arranged in a pyramidal structure as the network input, from which it extracts multiresolution features that improve the robustness of the network. The fully connected layer is replaced with fully convolutional layers that use a new upconvolution structure, which reduces the number of network parameters and the computational complexity. We propose a new loss function that combines a scale-invariant term with horizontal and vertical gradient terms, which not only helps predict the depth values but also recovers clear local contours. We evaluate PTSN on the NYU Depth v2 dataset, and the experimental results show that our depth predictions are more accurate than those of competing methods.
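
The abstract names the loss terms but does not give their exact form. The sketch below is one plausible reading, not the authors' formulation: it assumes a scale-invariant error computed on log depth, plus mean absolute horizontal and vertical differences of the same log-depth error. The function name `depth_loss`, the weights `lam` and `grad_weight`, and the clamp `eps` are hypothetical choices made for illustration.

```python
import numpy as np


def depth_loss(pred, target, lam=0.5, grad_weight=1.0, eps=1e-6):
    """Illustrative loss: scale-invariant term plus horizontal and vertical gradient terms.

    pred, target: 2-D arrays of positive depth values with the same shape.
    lam, grad_weight, eps: assumed hyperparameters, not taken from the paper.
    """
    # Per-pixel error in log-depth space; clamp to avoid log(0).
    d = np.log(np.maximum(pred, eps)) - np.log(np.maximum(target, eps))

    # Scale-invariant term: with lam = 1 a constant offset in log depth
    # (a global scale factor in depth) contributes nothing.
    scale_invariant = np.mean(d ** 2) - lam * np.mean(d) ** 2

    # Horizontal and vertical finite differences of the error map.
    # Penalising them encourages predicted edges to line up with true edges.
    grad_h = np.mean(np.abs(d[:, 1:] - d[:, :-1]))
    grad_v = np.mean(np.abs(d[1:, :] - d[:-1, :]))

    return scale_invariant + grad_weight * (grad_h + grad_v)


if __name__ == "__main__":
    target = np.array([[1.0, 2.0, 3.0],
                       [2.0, 3.0, 4.0]])
    # A prediction that is wrong only by a global scale factor.
    print(depth_loss(2.0 * target, target, lam=1.0))
```

With `lam=1.0`, a prediction that differs from the ground truth only by a global scale yields a loss of essentially zero, which is the sense in which the first term is scale-invariant. The gradient terms are also zero in that case, because the log-depth error is constant across the image.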

Keywords