Proceedings of the XXth Conference of Open Innovations Association FRUCT (May 2023)

Transformer-Based Dual-Modal Visual Target Tracking Using Visible Light and Thermal Infrared

  • Pengfei Lyu

DOI
https://doi.org/10.23919/FRUCT58615.2023.10143052
Journal volume & issue
Vol. 33, no. 1
pp. 176 – 184

Abstract

Visual target tracking is an essential technology with numerous applications, including video surveillance, motion recognition, and autonomous driving. Tracking accuracy degrades in challenging scenarios such as low light and occlusion, where it is difficult to extract effective tracking features from a visible light image alone. Infrared images, in contrast, can penetrate occlusion and are insensitive to lighting conditions, but tracking with infrared images alone is affected by thermal crosstalk and lacks detailed texture information. RGB-T tracking, which combines visible light and thermal infrared images, can therefore significantly enhance the accuracy and robustness of object tracking in challenging scenarios, especially in autonomous driving. We propose a transformer-based fusion tracker that utilizes dual-modal information and combines test and training branches with target encoding for global reasoning across frames. The proposed method tracks targets using both visible light and thermal infrared images. Experimental results on a public benchmark show that the proposed tracker achieves higher overall performance and can meet the requirements for precise and robust tracking of vulnerable road users in autonomous driving tasks.

Keywords