Predicting Depth from Single RGB Images with Pyramidal Three-Streamed Networks

Songnan Chen; Mengxia Tang; Jiangming Kan

doi:10.3390/s19030667

Sensors (Feb 2019)

Affiliations

Songnan Chen: School of Technology, Beijing Forestry University, No. 35 Qinghua East Road, Haidian District, Beijing 100083, China
Mengxia Tang: School of Technology, Beijing Forestry University, No. 35 Qinghua East Road, Haidian District, Beijing 100083, China
Jiangming Kan: School of Technology, Beijing Forestry University, No. 35 Qinghua East Road, Haidian District, Beijing 100083, China

Journal volume & issue: Vol. 19, no. 3, p. 667

Abstract

Predicting depth from a monocular image is an ill-posed and inherently ambiguous problem in computer vision. In this paper, we propose a pyramidal three-streamed network (PTSN) that recovers depth information from a single RGB image. PTSN takes pyramid-structured images as its input, which lets it extract multiresolution features and improves the robustness of the network. The fully connected layer is replaced with fully convolutional layers that use a new upconvolution structure, which reduces the number of network parameters and the computational complexity. We propose a new loss function that combines scale-invariant, horizontal gradient, and vertical gradient terms; it not only helps predict the depth values but also yields clear local contours. We evaluate PTSN on the NYU Depth v2 dataset, and the experimental results show that our depth predictions are more accurate than those of competing methods.
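
The abstract names the three loss terms but does not give their formulas. As a rough illustration only, the sketch below assumes a scale-invariant term of the widely used log-depth form, mean(d^2) - lam * mean(d)^2 with d = log(pred) - log(target), and assumes the gradient terms are mean absolute finite differences of d along the horizontal and vertical axes. The function names, the weights lam and w_grad, and the eps offset are hypothetical and are not taken from the paper. Depth maps are plain nested lists here so the example has no dependencies; a practical version would use an array library.

```python
import math


def log_diff(pred, target, eps=1e-6):
    """Per-pixel log-depth difference d = log(pred) - log(target).

    eps is an assumed small offset that guards against log(0).
    """
    return [[math.log(p + eps) - math.log(t + eps) for p, t in zip(p_row, t_row)]
            for p_row, t_row in zip(pred, target)]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def scale_invariant_loss(pred, target, lam=0.5):
    """Assumed form: mean(d^2) - lam * mean(d)^2.

    With lam = 1 the term ignores a global scale factor on the prediction.
    """
    d = [v for row in log_diff(pred, target) for v in row]
    return mean(v * v for v in d) - lam * mean(d) ** 2


def gradient_loss(pred, target):
    """Assumed form: mean |horizontal difference of d| + mean |vertical difference of d|.

    Requires depth maps of at least 2 x 2 pixels.
    """
    d = log_diff(pred, target)
    dx = [abs(row[j + 1] - row[j]) for row in d for j in range(len(row) - 1)]
    dy = [abs(d[i + 1][j] - d[i][j])
          for i in range(len(d) - 1) for j in range(len(d[0]))]
    return mean(dx) + mean(dy)


def combined_loss(pred, target, lam=0.5, w_grad=1.0):
    """Weighted sum of the terms; w_grad is a hypothetical weighting."""
    return scale_invariant_loss(pred, target, lam) + w_grad * gradient_loss(pred, target)
```

Under these assumptions, the gradient terms penalize errors that change abruptly between neighboring pixels, which is one plausible reading of how such a loss could sharpen local contours.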

Published in Sensors

ISSN: 1424-8220 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Chemical technology
Website: http://www.mdpi.com/journal/sensors
