SELF-SUPERVISED LEARNING FOR MONOCULAR DEPTH ESTIMATION FROM AERIAL IMAGERY

M. Hermann; B. Ruf; M. Weinmann; S. Hinz

doi:10.5194/isprs-annals-V-2-2020-357-2020

ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences (Aug 2020)

Affiliations

M. Hermann: Fraunhofer IOSB, Karlsruhe, Germany; Institute of Photogrammetry and Remote Sensing, KIT, Karlsruhe, Germany; Fraunhofer Center for Machine Learning
B. Ruf: Fraunhofer IOSB, Karlsruhe, Germany; Institute of Photogrammetry and Remote Sensing, KIT, Karlsruhe, Germany; Fraunhofer Center for Machine Learning
M. Weinmann: Institute of Photogrammetry and Remote Sensing, KIT, Karlsruhe, Germany
S. Hinz: Institute of Photogrammetry and Remote Sensing, KIT, Karlsruhe, Germany

Journal volume & issue: Vol. V-2-2020
pp. 357 – 364

Abstract

Supervised-learning-based methods for monocular depth estimation usually require large amounts of extensively annotated training data. In the case of aerial imagery, this ground truth is particularly difficult to acquire. Therefore, in this paper, we present a method for self-supervised learning for monocular depth estimation from aerial imagery that does not require annotated training data. For this, we only use an image sequence from a single moving camera and learn to simultaneously estimate depth and pose information. By sharing the weights between pose and depth estimation, we achieve a relatively small model, which favors real-time application. We evaluate our approach on three diverse datasets and compare the results to conventional methods that estimate depth maps based on multi-view geometry. We achieve an accuracy δ < 1.25 of up to 93.5 %. In addition, we have paid particular attention to the generalization of a trained model to unknown data and the self-improving capabilities of our approach. We conclude that, even though the results of monocular depth estimation are inferior to those achieved by conventional methods, they are well suited to provide a good initialization for methods that rely on image matching or to provide estimates in regions where image matching fails, e.g. occluded or texture-less regions.
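
The accuracy figure in the abstract refers to a threshold of 1.25 on the depth ratio. As a minimal sketch, assuming the definition commonly used in the depth estimation literature (the share of valid pixels for which max(d/d*, d*/d) is below the threshold, with d the predicted and d* the reference depth), it could be computed as follows. The function name, the flat-list input, and the rule of skipping non-positive depths are illustrative assumptions, not taken from the paper.

```python
def threshold_accuracy(pred, ref, threshold=1.25):
    """Share of valid pixels whose depth ratio max(p/r, r/p) is below threshold.

    pred and ref are flat sequences of depths in the same unit.
    Pixels with a non-positive value in either sequence are skipped
    (an assumed convention for marking missing depth).
    """
    hits = 0
    valid = 0
    for p, r in zip(pred, ref):
        if p <= 0 or r <= 0:
            continue  # no usable depth at this pixel
        valid += 1
        if max(p / r, r / p) < threshold:
            hits += 1
    if valid == 0:
        raise ValueError("no valid pixels to evaluate")
    return hits / valid


# Hypothetical depths in metres; the last reference value is missing.
predicted = [10.0, 20.0, 30.0, 50.0]
reference = [10.0, 18.0, 40.0, 0.0]
print(threshold_accuracy(predicted, reference))
```

In this toy input, two of the three valid pixels fall within the threshold, so the function returns two thirds. A reported value of 93.5 % would correspond to a return value of 0.935 under this definition.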

Published in ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences

ISSN: 2194-9042 (Print); 2194-9050 (Online)
Publisher: Copernicus Publications
Country of publisher: Germany
LCC subjects: Technology: Engineering (General). Civil engineering (General): Applied optics. Photonics
Website: http://www.isprs.org/publications/annals.aspx
