IEEE Access (Jan 2025)
Camera Absolute Pose Estimation Using Hierarchical Attention in Multi-Scene
Abstract
Multi-scene camera pose estimation aims to recover the camera pose in any given scene, meeting the demands of tasks performed on real-world mobile devices. Because extracting effective features is difficult when training multi-scene models, we present a modified model, Hierarchical Attention Absolute Pose Regression (H-AttnAPR), which captures feature dependencies at different scales. A Hierarchical Attention (HA) module is placed before the scene classification module, where it captures both intra-patch and inter-patch correlations, using local and global key information from the image to recover the absolute camera pose without additional point cloud data. H-AttnAPR efficiently models global dependencies without compromising fine-grained feature information. It thereby overcomes the limitation of approaches that focus solely on long-range pixel-level feature dependencies within an image while neglecting the feature dependencies of local image patches. We validate our approach on the 7Scenes and Cambridge benchmark datasets. Compared with the baseline algorithm PoseNet, it reduces translation error by 41.1% and rotation error by 61.1%, demonstrating superior performance in multi-scene absolute camera pose regression.
Keywords