Mathematics (Feb 2025)

TensorTrack: Tensor Decomposition for Video Object Tracking

  • Yuntao Gu,
  • Pengfei Zhao,
  • Lan Cheng,
  • Yuanjun Guo,
  • Haikuan Wang,
  • Wenjun Ding,
  • Yu Liu

DOI
https://doi.org/10.3390/math13040568
Journal volume & issue
Vol. 13, no. 4
p. 568

Abstract

Video Object Tracking (VOT) is a critical task in computer vision. Although Siamese-based and Transformer-based trackers are widely used in VOT, they struggle to perform well on the OTB100 benchmark because it provides no dedicated training set, which highlights the difficulty of generalizing effectively to unseen data. To address this issue, this paper proposes an innovative method based on tensor decomposition, a concept that remains underexplored in object-tracking research. By applying L1-norm tensor decomposition, video sequences are represented as four-mode tensors, and a real-time background subtraction algorithm is introduced; this enables effective modeling of the target–background relationship and adaptation to environmental changes, yielding accurate and robust tracking. Additionally, the paper integrates an improved multi-kernel correlation filter operating on a single frame, which locates and tracks the target by comparing the correlation between the target template and the input image. To further enhance localization precision and robustness, the paper also incorporates Tucker2 decomposition to fuse appearance and motion patterns into composite heatmaps. The method is evaluated on the OTB100 benchmark dataset, showing significant improvements in both accuracy and speed compared to traditional methods. Experimental results demonstrate that the proposed method achieves a 15.8% improvement in AUC and a tenfold increase in speed compared with typical deep learning-based methods, providing an efficient and accurate real-time tracking solution, particularly in scenarios with similar target–background characteristics, high-speed motion, and limited target movement.
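To illustrate the Tucker2 step mentioned in the abstract, the following is a minimal HOSVD-style sketch in NumPy, not the authors' implementation: a 3-mode tensor (here a toy "video" of height × width × frames) is compressed along the two spatial modes while the temporal mode is left intact, which is the defining property of a Tucker2 decomposition. The function names and rank choices are illustrative assumptions.

```python
import numpy as np

def unfold(T, mode):
    # Mode-n unfolding: bring `mode` to the front and flatten the rest.
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def tucker2(T, r1, r2):
    """Tucker2 decomposition of a 3-mode tensor T (I x J x K) via HOSVD:
    factor matrices U1 (I x r1), U2 (J x r2) compress modes 0 and 1;
    mode 2 (e.g., the frame axis) is left uncompressed.
    Returns the core G (r1 x r2 x K) and the factors (U1, U2)."""
    U1 = np.linalg.svd(unfold(T, 0), full_matrices=False)[0][:, :r1]
    U2 = np.linalg.svd(unfold(T, 1), full_matrices=False)[0][:, :r2]
    # Core tensor: contract T with U1^T on mode 0 and U2^T on mode 1.
    G = np.einsum('ijk,ia,jb->abk', T, U1, U2)
    return G, U1, U2

def reconstruct(G, U1, U2):
    # Expand the core back to the original spatial resolution.
    return np.einsum('abk,ia,jb->ijk', G, U1, U2)

# Usage: a tiny synthetic "video" of 20 frames, 16x16 pixels each.
rng = np.random.default_rng(0)
video = rng.standard_normal((16, 16, 20))
G, U1, U2 = tucker2(video, r1=4, r2=4)
approx = reconstruct(G, U1, U2)
print(G.shape)  # (4, 4, 20): compact spatial core, full temporal axis
```

With ranks below the spatial dimensions, `approx` is a low-rank spatial summary of every frame; with full ranks the reconstruction is exact, since each factor matrix is then orthogonal.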

Keywords