Machine Learning with Applications (Dec 2022)

Deep learning of monocular depth, optical flow and ego-motion with geometric guidance for UAV navigation in dynamic environments

  • Fuseini Mumuni
  • Alhassan Mumuni
  • Christian Kwaku Amuzuvi

Journal volume & issue
Vol. 10
p. 100416

Abstract

Computer vision-based depth estimation and visual odometry provide perceptual information useful for robot navigation tasks such as obstacle avoidance. However, despite the proliferation of state-of-the-art convolutional neural network (CNN) models for monocular depth, ego-motion and optical flow estimation, relatively little work has been reported on their practical application in unmanned aerial vehicle (UAV) navigation. This is due to well-known challenges: embedded hardware constraints, viewpoint variations, scarcity of aerial image datasets, and the intricacies of dynamic environments. We address these limitations to facilitate real-world deployment of CNNs in UAV navigation. First, we devise an efficient confidence-weighted adaptive network (Cowan) training framework that iteratively leverages intermediate prediction confidences to enforce cross-task consistency over corresponding image regions. This achieves competitive accuracy with a lightweight CNN capable of real-time execution on resource-constrained embedded systems. Second, we devise a test-time refinement method that adapts the network to dynamic environments while simultaneously improving accuracy. To accomplish this, we first update ego-motion using pose information from an on-board inertial measurement unit (IMU). Then, we decompose the UAV’s motion into its constituent vectors and, for each axis, formulate geometric relationships between depth and translation. Based on this information, we triangulate corresponding points acquired through optical flow. Finally, we enforce geometric consistency between the initially updated pose and the triangulated depth. Cowan with geometry-guided refinement (Cowan-GGR) achieves significant gains in accuracy and robustness. Field tests show that the proposed model is capable of accurate depth and object-level motion perception in real-world dynamic environments, proving its efficacy in facilitating UAV navigation.

Keywords