IET Computer Vision (Aug 2019)

Fully convolutional multi‐scale dense networks for monocular depth estimation

  • Jiwei Liu,
  • Yunzhou Zhang,
  • Jiahua Cui,
  • Yonghui Feng,
  • Linzhuo Pang

DOI
https://doi.org/10.1049/iet-cvi.2018.5645
Journal volume & issue
Vol. 13, no. 5
pp. 515 – 522

Abstract

Monocular depth estimation is of vital importance in understanding the 3D geometry of a scene. However, inferring the underlying depth is ill‐posed and inherently ambiguous. In this study, two improvements to existing approaches are proposed. The first is a cleaner network architecture: the authors extend the Densely Connected Convolutional Network (DenseNet) into an end‐to‐end fully convolutional multi‐scale dense network. Dense upsampling blocks are integrated to improve the output resolution, and selected skip connections are incorporated to connect the downsampling and upsampling paths efficiently. The second is a set of edge‐preserving loss functions, encompassing the reverse Huber loss, the depth gradient loss and the feature edge loss, which are particularly suited to estimating fine details and clear object boundaries. Experiments on the NYU‐Depth‐v2 and KITTI datasets show that the proposed model is competitive with state‐of‐the‐art methods, achieving root mean squared errors of 0.506 and 4.977, respectively.
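
For reference, the reverse Huber (berHu) loss mentioned in the abstract has a commonly used formulation: L1 for small residuals and a scaled L2 penalty for large ones. The sketch below shows that standard form only; the threshold choice (c = 0.2 of the maximum residual) and the code itself are assumptions for illustration, not the authors' exact implementation or their weighting against the depth gradient and feature edge terms.

```python
import torch

def berhu_loss(pred, target):
    """Reverse Huber (berHu) loss: L1 for small residuals, scaled L2 for large ones.

    Sketch under assumed settings; c = 0.2 * max|residual| is a common choice,
    not necessarily the paper's.
    """
    diff = torch.abs(pred - target)
    c = torch.clamp(0.2 * diff.max(), min=1e-6)   # assumed threshold
    l1_part = diff                                # |x|              for |x| <= c
    l2_part = (diff ** 2 + c ** 2) / (2 * c)      # (x^2 + c^2)/(2c) for |x| >  c
    return torch.where(diff <= c, l1_part, l2_part).mean()

# Usage example with random depth maps of shape (batch, 1, H, W)
pred = torch.rand(2, 1, 64, 64)
target = torch.rand(2, 1, 64, 64)
print(berhu_loss(pred, target))
```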

Keywords