IEEE Access (Jan 2023)

Transformer With Linear-Window Attention for Feature Matching

  • Zhiwei Shen,
  • Bin Kong,
  • Xiaoyu Dong

DOI
https://doi.org/10.1109/ACCESS.2023.3328855
Journal volume & issue
Vol. 11
pp. 121202 – 121211

Abstract

A transformer can capture long-term dependencies through an attention mechanism and can therefore be applied to various vision tasks. However, its quadratic computational complexity is a major obstacle in vision tasks that require accurate predictions. To address this limitation, this study introduces linear-window attention (LWA), a new attention model for the vision transformer. LWA restricts self-attention to nonoverlapping local windows and represents it as a linear dot product of kernel feature maps. Furthermore, using the associative property of matrix products, the computational complexity of each window is reduced from quadratic to linear. In addition, we applied LWA to feature matching to construct a coarse-to-fine, detector-free feature matching method called transformer with linear-window attention for feature matching (TRLWAM). At the coarse level, we extracted dense pixel-level matches, and at the fine level, we obtained the final matching results via multi-head multilayer perceptron refinement. We demonstrated the effectiveness of LWA through replacement experiments. The results showed that TRLWAM can extract dense matches from low-texture or repetitive-pattern regions in indoor environments, and achieves excellent results at low computational cost on the MegaDepth and HPatches datasets. We believe the proposed LWA can provide new insights for transformer applications in vision tasks.
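The core idea in the abstract, self-attention restricted to nonoverlapping windows and linearized through a kernel feature map so that keys and values are aggregated before meeting the queries, can be sketched as below. This is a minimal NumPy illustration under stated assumptions, not the paper's implementation: the `elu(x)+1` kernel, the window size, and the single-head layout are all assumptions commonly used in linear-attention work.

```python
import numpy as np

def linear_window_attention(x, w_q, w_k, w_v, window=4):
    """Kernelized self-attention computed independently per nonoverlapping window.

    x: (n_tokens, d) token features; n_tokens is assumed divisible by `window`.
    w_q, w_k, w_v: (d, d) projection matrices.
    """
    n, d = x.shape
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    # Positive kernel feature map phi(t) = elu(t) + 1 (an assumption here).
    phi = lambda t: np.where(t > 0, t + 1.0, np.exp(t))
    out = np.empty_like(v)
    for s in range(0, n, window):
        qw, kw, vw = phi(q[s:s + window]), phi(k[s:s + window]), v[s:s + window]
        # Associativity: phi(Q) @ (phi(K)^T V) costs O(w * d^2)
        # instead of the O(w^2 * d) of (phi(Q) phi(K)^T) @ V.
        kv = kw.T @ vw                 # (d, d) summary of the window
        z = qw @ kw.sum(axis=0)        # per-query normalizer
        out[s:s + window] = (qw @ kv) / z[:, None]
    return out
```

Because the attention weights implied by the positive kernel are normalized per query, each output row is a convex combination of the value rows inside its own window, which is what makes the per-window cost linear in the window length.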

Keywords