Heliyon (Sep 2024)

Monocular visual SLAM, visual odometry, and structure from motion methods applied to 3D reconstruction: A comprehensive survey

  • Erick P. Herrera-Granda,
  • Juan C. Torres-Cantero,
  • Diego H. Peluffo-Ordóñez

Journal volume & issue
Vol. 10, no. 18
p. e37356

Abstract

Read online

Monocular Simultaneous Localization and Mapping (SLAM), Visual Odometry (VO), and Structure from Motion (SFM) are techniques that have emerged recently to address the problem of reconstructing objects or environments using monocular cameras. Monocular pure visual techniques have become attractive solutions for 3D reconstruction tasks due to their affordability, lightweight, easy deployment, good outdoor performance, and availability in most handheld devices without requiring additional input devices. In this work, we comprehensively overview the SLAM, VO, and SFM solutions for the 3D reconstruction problem that uses a monocular RGB camera as the only source of information to gather basic knowledge of this ill-posed problem and classify the existing techniques following a taxonomy. To achieve this goal, we extended the existing taxonomy to cover all the current classifications in the literature, comprising classic, machine learning, direct, indirect, dense, and sparse methods. We performed a detailed overview of 42 methods, considering 18 classic and 24 machine learning methods according to the ten categories defined in our extended taxonomy, comprehensively systematizing their algorithms and providing their basic formulations. Relevant information about each algorithm was summarized in nine criteria for classic methods and eleven criteria for machine learning methods to provide the reader with decision components to implement, select or design a 3D reconstruction system. Finally, an analysis of the temporal evolution of each category was performed, which determined that the classical-sparse-indirect and classical-dense-indirect categories have been the most accepted solutions to the monocular 3D reconstruction problem over the last 18 years.

Keywords