IEEE Access (Jan 2023)

Enhancing Single Object Tracking With a Hybrid Approach: Temporal Convolutional Networks, Attention Mechanisms, and Spatial–Temporal Memory

  • Pimpa Cheewaprakobkit,
  • Chih-Yang Lin,
  • Timothy K. Shih,
  • Avirmed Enkhbat

DOI
https://doi.org/10.1109/ACCESS.2023.3330644
Journal volume & issue
Vol. 11
pp. 139211 – 139222

Abstract

Trackers based on deep neural networks have advanced significantly in recent years. However, they still struggle to adapt to appearance changes in both the target and the background, and to re-associate objects after long intervals. The primary challenge is that a target's appearance changes frequently during tracking, which reduces tracker robustness under aspect ratio changes, occlusions, scale variations, and confusion with similar objects. To address this challenge, we propose a tracking architecture that combines a temporal convolutional network (TCN) and an attention mechanism with spatial–temporal memory. The TCN component enables the model to capture temporal dependencies, while the attention mechanism reduces computational complexity by focusing on crucial regions based on context. The target's historical information, stored in the spatial–temporal memory network, guides the tracker in adapting to target deformation. Our model attains a 67.5% average overlap (AO) on the GOT-10K dataset, a 72.1% success score (AUC) on OTB2015, a 65.8% success score (AUC) on UAV123, and 59.0% accuracy on the VOT2018 dataset. These results indicate that the proposed tracker is effective for single-object tracking.
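
For readers unfamiliar with the reported metrics, the following is a minimal sketch of how average overlap (AO) and a success score (AUC) are commonly computed from per-frame bounding boxes. It is an illustration, not the authors' evaluation code: the function names, the (x, y, w, h) box format, the single-sequence simplification, and the 21 evenly spaced overlap thresholds are all assumptions made here, and the official benchmark toolkits differ in details such as how sequences are aggregated.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Width and height of the intersection rectangle (zero if disjoint).
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def average_overlap(preds, gts):
    """Mean IoU between predicted and ground-truth boxes over all frames."""
    overlaps = [iou(p, g) for p, g in zip(preds, gts)]
    return sum(overlaps) / len(overlaps)


def success_auc(preds, gts, num_thresholds=21):
    """Area under the success plot, approximated as the mean success rate.

    The success rate at threshold t is the fraction of frames whose IoU
    exceeds t; thresholds are spaced evenly over [0, 1] (assumed here).
    """
    overlaps = [iou(p, g) for p, g in zip(preds, gts)]
    thresholds = [i / (num_thresholds - 1) for i in range(num_thresholds)]
    rates = [sum(o > t for o in overlaps) / len(overlaps) for t in thresholds]
    return sum(rates) / len(rates)


if __name__ == "__main__":
    # Hypothetical two-frame sequence: a perfect match, then a half-shifted box.
    preds = [(0, 0, 10, 10), (0, 0, 10, 10)]
    gts = [(0, 0, 10, 10), (5, 0, 10, 10)]
    print("AO:", average_overlap(preds, gts))
    print("Success AUC:", success_auc(preds, gts))
```

Both metrics are built on per-frame IoU, so a 67.5% AO means that predicted and ground-truth boxes overlap by about two thirds on average, while the AUC additionally reflects how that overlap is distributed across thresholds.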

Keywords