FastMDE: A Fast CNN Architecture for Monocular Depth Estimation at High Resolution

Thien-Thanh Dao; Quoc-Viet Pham; Won-Joo Hwang

doi:10.1109/ACCESS.2022.3145969

IEEE Access (Jan 2022)

Affiliations

Thien-Thanh Dao: Department of Computer Engineering, Pusan National University, Yangsan-si, South Korea
Quoc-Viet Pham: Korean Southeast Center for the 4th Industrial Revolution Leader Education, Pusan National University, Busan, South Korea
Won-Joo Hwang: Department of Biomedical Convergence Engineering, Pusan National University, Yangsan-si, South Korea

DOI: https://doi.org/10.1109/ACCESS.2022.3145969
Journal volume & issue: Vol. 10, pp. 16111–16122

Abstract

A depth map helps robots and autonomous vehicles (AVs) visualize the three-dimensional world to navigate and localize neighboring obstacles. However, it is difficult to develop a deep learning model that can estimate the depth map from a single image in real time. This study proposes a fast monocular depth estimation (MDE) model named FastMDE by optimizing a deep convolutional neural network built on the encoder-decoder architecture. The decoder needs to obtain partial and semantic feature maps from the encoding phase to improve the depth estimation accuracy. Therefore, we designed FastMDE with two strategies. The first redesigns the skip connection with the features of the squeeze-excitation module to obtain partial and semantic feature maps of the encoding phase. The second redesigns the decoder with a fusion dense block, which allows high-resolution features learned earlier in the network to be used before upsampling. The proposed FastMDE model uses only 4.1 M parameters, far fewer than state-of-the-art models. FastMDE also has higher accuracy and lower latency than previous models. This study also demonstrates that MDE can leverage deep neural networks in real time (i.e., 30 fps) on the Linux embedded board Nvidia Jetson Xavier NX. The model can facilitate the development of applications with superior performance and easy deployment on embedded platforms.
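
The abstract does not say how the squeeze-excitation module is wired into the skip connection, so the following is only a minimal sketch of the generic squeeze-and-excitation operation on a single feature map, not the authors' implementation. The function name `squeeze_excite`, the nested-list layout `[C][H][W]`, and the weight matrices `w1` and `w2` are assumptions made for illustration; biases are omitted.

```python
import math


def squeeze_excite(feature_map, w1, w2):
    """Reweight the channels of a feature map stored as [C][H][W] nested lists.

    w1 has shape [C_reduced][C] and w2 has shape [C][C_reduced]. Both are
    hypothetical weights chosen for illustration, not values from the paper.
    """
    # Squeeze: global average pooling yields one scalar per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation: bottleneck layer with ReLU, then expansion with sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[value * gate for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return scaled, gates
```

Each gate lies between 0 and 1, so the operation can only attenuate a channel relative to the others. In an encoder-decoder network, a gate of this kind applied to encoder features would let the decoder receive them with informative channels emphasized.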

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
