IEEE Access (Jan 2024)

SiamSEA: Semantic-Aware Enhancement and Associative-Attention Dual-Modal Siamese Network for Robust RGBT Tracking

  • Zihan Zhuang,
  • Mingfeng Yin,
  • Qi Gao,
  • Yong Lin,
  • Xing Hong

DOI
https://doi.org/10.1109/ACCESS.2024.3442810
Journal volume & issue
Vol. 12
pp. 134874 – 134887

Abstract


Recently, RGBT tracking methods have been widely applied in visual tracking tasks owing to the complementarity of visible and thermal infrared images. However, in most RGBT trackers the feature extraction network is not specifically trained on thermal infrared images, so thermal radiation information is incompletely expressed in the tracking task. To solve this problem, a novel RGBT Siamese tracker, SiamSEA, is proposed to enhance the expression of features from the different modalities. Firstly, a semantic-aware enhancement (SE) module is applied to strengthen features in visible images by fusing complementary information. Secondly, to handle the different backgrounds in the dual-modal branches, we design an associative-attention mechanism that includes a shuffle attention enhancement (SAE) module and a channel attention enhancement (CAE) module. CAE focuses on the object feature and SAE observes the spatial information; both provide accurate features for the template matching calculation. Afterwards, the dual-modal classification maps and all regression maps are fused at the response level. Finally, the adaptive best score selection (ABSS) module is deployed to flexibly select prediction results in different scenarios. Experimental results on three challenging datasets demonstrate the effectiveness and robustness of SiamSEA. Its MPR/MSR (%) and tracking speed are: GTOT (90.4/73.7, 99.4 fps), RGBT234 (77.2/53.8, 72.7 fps) and VTUAV (69.7/55.7, 32.3 fps).
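
The abstract reports MPR/MSR without defining them. As a rough illustration only, the sketch below assumes the convention common in RGBT benchmarks: each frame is scored against both the visible and the thermal ground-truth box, and the more favourable result is kept (smaller center error for precision, larger overlap for success). The function names, the (x, y, w, h) box format, the 20-pixel precision threshold and the 21-step overlap sweep are assumptions made for this example; they are not taken from the paper.

```python
import math

def center_error(a, b):
    """Euclidean distance between the centers of two (x, y, w, h) boxes."""
    ax, ay = a[0] + a[2] / 2, a[1] + a[3] / 2
    bx, by = b[0] + b[2] / 2, b[1] + b[3] / 2
    return math.hypot(ax - bx, ay - by)

def iou(a, b):
    """Intersection over union of two (x, y, w, h) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2 = min(a[0] + a[2], b[0] + b[2])
    y2 = min(a[1] + a[3], b[1] + b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def max_precision_rate(preds, gt_rgb, gt_tir, threshold=20.0):
    """Fraction of frames whose smaller center error (over the two
    modality annotations) is within `threshold` pixels (assumed value)."""
    hits = sum(
        1
        for p, g_rgb, g_tir in zip(preds, gt_rgb, gt_tir)
        if min(center_error(p, g_rgb), center_error(p, g_tir)) <= threshold
    )
    return hits / len(preds)

def max_success_rate(preds, gt_rgb, gt_tir, steps=21):
    """Mean success rate over evenly spaced overlap thresholds in [0, 1],
    using the larger IoU over the two modality annotations per frame."""
    overlaps = [
        max(iou(p, g_rgb), iou(p, g_tir))
        for p, g_rgb, g_tir in zip(preds, gt_rgb, gt_tir)
    ]
    thresholds = [i / (steps - 1) for i in range(steps)]
    rates = [sum(o > t for o in overlaps) / len(overlaps) for t in thresholds]
    return sum(rates) / len(rates)

# Toy two-frame sequence: the first prediction is exact, the second is far off.
preds = [(0, 0, 10, 10), (100, 100, 10, 10)]
gt_rgb = [(0, 0, 10, 10), (0, 0, 10, 10)]
gt_tir = [(2, 0, 10, 10), (0, 0, 10, 10)]
mpr = max_precision_rate(preds, gt_rgb, gt_tir)
msr = max_success_rate(preds, gt_rgb, gt_tir)
```

Multiplying either value by 100 gives a percentage comparable in form to the figures quoted above. The thresholds actually used for each dataset may differ from the ones assumed here.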

Keywords