IEEE Access (Jan 2023)
A Review of Recurrent Neural Network Based Camera Localization for Indoor Environments
Abstract
Camera localization is the estimation of the camera pose for an image of an arbitrary scene. The input is a single image, a sequence of images, or a video, and the output depends on the scene representation and the method used. Several computer vision applications, such as robot navigation and safety inspection, can benefit from camera localization. It can also be used to determine the position of an object relative to the camera across a sequence of images. Structure-based localization techniques have achieved considerable success owing to a combination of image matching and coordinate regression. Absolute and relative pose regression techniques can provide end-to-end learning; however, they exhibit poor accuracy. Despite the rapid growth in computer vision, there has been no thorough review that categorizes, evaluates, and synthesizes structure-based and regression-based techniques. In addition, input formats and loss strategies for recurrent neural networks (RNNs) have not been adequately described in the literature. This review focuses on indoor camera pose regression, which is a subset of camera localization techniques. First, we discuss certain application areas for camera localization. We then discuss different camera localization techniques, such as feature-based and structure-based techniques, absolute and relative pose regression techniques, and simultaneous localization and mapping (SLAM). Next, we evaluate the frequently used datasets and qualitatively compare the absolute and relative camera pose estimation approaches. Finally, we discuss potential directions for future research, such as optimizing the computational cost of the features and evaluating the end-to-end characteristics of multiple cameras.
Keywords