Learning to Fuse Multiscale Features for Visual Place Recognition

Jun Mao; Xiaoping Hu; Xiaofeng He; Lilian Zhang; Liao Wu; Michael J. Milford

doi:10.1109/access.2018.2889030

IEEE Access (Jan 2019)

Affiliations

Jun Mao: Department of Automation, National University of Defense Technology, Changsha, China
Xiaoping Hu: Department of Automation, National University of Defense Technology, Changsha, China
Xiaofeng He: Department of Automation, National University of Defense Technology, Changsha, China
Lilian Zhang: Department of Automation, National University of Defense Technology, Changsha, China
Liao Wu: School of Electrical Engineering and Computer Science, Queensland University of Technology, Brisbane, QLD, Australia
Michael J. Milford: School of Electrical Engineering and Computer Science, Queensland University of Technology, Brisbane, QLD, Australia

DOI: https://doi.org/10.1109/access.2018.2889030
Journal volume & issue: Vol. 7
pp. 5723 – 5735

Abstract

Efficient and robust visual place recognition is of great importance to autonomous mobile robots. Recent work has shown that features learned by convolutional neural networks achieve impressive performance with compact feature sizes; most of these features are pooled or aggregated from a convolutional feature map. However, convolutional filters capture only the appearance within their receptive fields and do not account for how multiscale appearance should be combined for place recognition. In this paper, we propose a novel method to build a multiscale feature pyramid and present two approaches that use the pyramid to improve place recognition. The first approach fuses the pyramid into a new feature map that is aware of both the local and the semi-global appearance. The second approach learns an attention model from the feature pyramid to weight the spatial grids of the original feature map. Both approaches combine the multiscale features in the pyramid to suppress confusing local features, but they tackle the problem in two different ways. Extensive experiments have been conducted on benchmark datasets with varying degrees of appearance and viewpoint variation. The results show that the proposed approaches outperform the same networks without the multiscale feature fusion and multiscale attention components. Analyses of the performance obtained with different feature pyramids are also provided.
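The two uses of the pyramid described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how the pyramid is built or how the fusion and attention parameters are learned, so the code below assumes average pooling over k x k neighbourhoods to form the pyramid levels, and uses fixed, hypothetical weights (`level_weights`, `score_weights`) in place of learned parameters. All function names are illustrative.

```python
import math

def pool_scale(fmap, k):
    """Average each cell's k x k neighbourhood (clipped at the borders).

    fmap is an H x W grid of C-dimensional feature vectors (nested lists).
    Average pooling is an assumption made for this sketch.
    """
    h, w, c = len(fmap), len(fmap[0]), len(fmap[0][0])
    r = k // 2
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            cells = [fmap[y][x]
                     for y in range(max(0, i - r), min(h, i + r + 1))
                     for x in range(max(0, j - r), min(w, j + r + 1))]
            row.append([sum(v[d] for v in cells) / len(cells) for d in range(c)])
        out.append(row)
    return out

def build_pyramid(fmap, scales=(1, 3, 5)):
    """One pooled map per scale; k = 1 reproduces the original map."""
    return [pool_scale(fmap, k) for k in scales]

def fuse(pyramid, level_weights):
    """First approach (sketch): per-cell weighted sum of the pyramid levels.

    level_weights is a hypothetical stand-in for learned fusion parameters.
    """
    h, w, c = len(pyramid[0]), len(pyramid[0][0]), len(pyramid[0][0][0])
    return [[[sum(wt * lvl[i][j][d] for wt, lvl in zip(level_weights, pyramid))
              for d in range(c)]
             for j in range(w)]
            for i in range(h)]

def attention_descriptor(fmap, pyramid, score_weights):
    """Second approach (sketch): weight spatial cells of the original map.

    Each cell is scored by a dot product between score_weights (hypothetical,
    length = levels * C) and the cell's stacked pyramid features. A softmax
    over all cells gives attention weights, which then weight the ORIGINAL
    feature map before sum pooling into one descriptor.
    """
    h, w, c = len(fmap), len(fmap[0]), len(fmap[0][0])
    scores = []
    for i in range(h):
        for j in range(w):
            stacked = [x for lvl in pyramid for x in lvl[i][j]]
            scores.append(sum(a * b for a, b in zip(score_weights, stacked)))
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    attn = [e / total for e in exps]
    desc = [0.0] * c
    for idx, a in enumerate(attn):
        i, j = divmod(idx, w)
        for d in range(c):
            desc[d] += a * fmap[i][j][d]
    return desc, attn
```

In this sketch, cells whose multiscale context scores low receive small attention weights and contribute little to the final descriptor, which is one simple way to read the abstract's "suppress the confusing local features".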

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
