IEEE Access (Jan 2024)

ST-PixLoc: A Scene-Agnostic Network for Enhanced Camera Localization

  • Jing Wang,
  • Yibo Wang,
  • Yuchu Jin,
  • Cheng Guo,
  • Xuhui Fan

DOI
https://doi.org/10.1109/ACCESS.2024.3435851
Journal volume & issue
Vol. 12
pp. 105294 – 105308

Abstract

Visual localization is a significant problem in computer vision and robotics: estimating the six-degree-of-freedom (6-DoF) pose of a camera relative to a known environment from captured images. Most deep-learning-based visual localization methods generalize poorly. Some scene-independent methods generalize satisfactorily but often suffer from low localization accuracy. To address this low accuracy, as well as the indiscriminate fusion of channel and spatial information during neural-network feature extraction, we propose ST-PixLoc, a visual localization method built on the PixLoc framework that effectively leverages image edge gradient information. First, we optimize the gradients of the input images to increase the weight of gradient values within each image. Second, we employ ResNet50 as the downsampling layers of the feature extraction network to strengthen feature extraction, and we introduce channel attention mechanisms in its upsampling layers; this mechanism focuses on relevant information and resolves the indiscriminate fusion problem. Third, from the feature maps of the images under consideration, we compute feature residuals and refine the initial pose with an optimization algorithm. Additionally, we optimize the loss function to improve the model’s accuracy in complex scenes. Experimental results demonstrate that the proposed method achieves high-precision localization: the average rotation and translation errors improved by 6.9% and 9.7%, respectively, on the indoor 7-Scenes dataset and by 16.7% and 28.2% on the outdoor Cambridge Landmarks dataset, validating the effectiveness of the proposed approach.
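
To make the channel attention step concrete, the following is a minimal NumPy sketch of one common form of channel attention (a squeeze-and-excitation style gate). The abstract does not specify which variant ST-PixLoc uses, so the function name `channel_attention`, the reduction ratio `r`, and the random weights `w1` and `w2` are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def channel_attention(features, w1, w2):
    """Reweight the channels of a (C, H, W) feature map.

    Hypothetical squeeze-and-excitation style gate; not taken from the paper.
    w1 has shape (C // r, C) and w2 has shape (C, C // r).
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = features.mean(axis=(1, 2))              # shape (C,)
    # Excitation: bottleneck layer with ReLU, then a sigmoid gate.
    hidden = np.maximum(w1 @ squeezed, 0.0)            # shape (C // r,)
    weights = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))     # shape (C,), values in (0, 1)
    # Scale: broadcast each channel weight over its spatial map.
    return features * weights[:, None, None], weights

# Toy input with assumed sizes; real weights would be learned during training.
rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2
features = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
reweighted, weights = channel_attention(features, w1, w2)
```

Each channel is scaled by a single learned weight, so informative channels can be emphasized before the upsampled features are fused, instead of all channels being mixed with equal importance.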

Keywords