Remote Sensing (Feb 2023)

TMTNet: A Transformer-Based Multimodality Information Transfer Network for Hyperspectral Object Tracking

  • Chunhui Zhao,
  • Hongjiao Liu,
  • Nan Su,
  • Congan Xu,
  • Yiming Yan,
  • Shou Feng

DOI
https://doi.org/10.3390/rs15041107
Journal volume & issue
Vol. 15, no. 4
p. 1107

Abstract

Read online

Hyperspectral video with spatial and spectral information has great potential to improve object tracking performance. However, the limited hyperspectral training samples hinder the development of hyperspectral object tracking. Since hyperspectral data has multiple bands, from which any three bands can be extracted to form pseudocolor images, we propose a Transformer-based multimodality information transfer network (TMTNet), aiming to improve the tracking performance by efficiently transferring the information of multimodality data composed of RGB and hyperspectral in the hyperspectral tracking process. The multimodality information needed to be transferred mainly includes the RGB and hyperspectral multimodality fusion information and the RGB modality information. Specifically, we construct two subnetworks to transfer the multimodality fusion information and the robust RGB visual information, respectively. Among them, the multimodality fusion information transfer subnetwork is designed based on the dual Siamese branch structure. The subnetwork employs the pretrained RGB tracking model as the RGB branch to guide the training of the hyperspectral branch with little training samples. The RGB modality information transfer subnetwork is designed based on a pretrained RGB tracking model with good performance to improve the tracking network’s generalization and accuracy in unknown complex scenes. In addition, we design an information interaction module based on Transformer in the multimodality fusion information transfer subnetwork. The module can fuse multimodality information by capturing the potential interaction between different modalities. We also add a spatial optimization module to TMTNet, which further optimizes the object position predicted by the subject network by fully retaining and utilizing detailed spatial information. Experimental results on the only available hyperspectral tracking benchmark dataset show that the proposed TMTNet tracker outperforms the advanced trackers, demonstrating the effectiveness of this method.

Keywords