Mathematics (Feb 2025)

TensorTrack: Tensor Decomposition for Video Object Tracking

  • Yuntao Gu,
  • Pengfei Zhao,
  • Lan Cheng,
  • Yuanjun Guo,
  • Haikuan Wang,
  • Wenjun Ding,
  • Yu Liu

DOI
https://doi.org/10.3390/math13040568
Journal volume & issue
Vol. 13, no. 4
p. 568

Abstract

Video Object Tracking (VOT) is a critical task in computer vision. Although Siamese-based and Transformer-based trackers are widely used in VOT, they struggle to perform well on the OTB100 benchmark because it provides no dedicated training set, which highlights the difficulty of generalizing effectively to unseen data. To address this issue, this paper proposes an innovative method based on tensor decomposition, a concept that remains underexplored in object-tracking research. By applying L1-norm tensor decomposition, video sequences are represented as four-mode tensors, and a real-time background subtraction algorithm is introduced; this enables effective modeling of the target–background relationship and adaptation to environmental changes, yielding accurate and robust tracking. Additionally, the paper integrates an improved multi-kernel correlation filter operating on a single frame, which locates and tracks the target by comparing the correlation between the target template and the input image. To further enhance localization precision and robustness, the paper also incorporates Tucker2 decomposition to fuse appearance and motion patterns into composite heatmaps. The method is evaluated on the OTB100 benchmark dataset, showing significant improvements in both accuracy and speed compared to traditional methods. Experimental results demonstrate that the proposed method achieves a 15.8% improvement in AUC and a tenfold increase in speed compared with typical deep learning-based methods, providing an efficient and accurate real-time tracking solution, particularly in scenarios with similar target–background characteristics, high-speed motion, and limited target movement.
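To illustrate the Tucker2 step mentioned in the abstract, the following is a minimal HOSVD-style sketch in NumPy, not the authors' implementation: a 3-mode tensor (here a toy "video" of height × width × frames) is compressed along the two spatial modes while the temporal mode is left intact, which is the defining property of a Tucker2 decomposition. The function names and rank choices are illustrative assumptions.

```python
import numpy as np

def unfold(T, mode):
    # Mode-n unfolding: bring `mode` to the front and flatten the rest.
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def tucker2(T, r1, r2):
    """Tucker2 decomposition of a 3-mode tensor T (I x J x K) via HOSVD:
    factor matrices U1 (I x r1), U2 (J x r2) compress modes 0 and 1;
    mode 2 (e.g., the frame axis) is left uncompressed.
    Returns the core G (r1 x r2 x K) and the factors (U1, U2)."""
    U1 = np.linalg.svd(unfold(T, 0), full_matrices=False)[0][:, :r1]
    U2 = np.linalg.svd(unfold(T, 1), full_matrices=False)[0][:, :r2]
    # Core tensor: contract T with U1^T on mode 0 and U2^T on mode 1.
    G = np.einsum('ijk,ia,jb->abk', T, U1, U2)
    return G, U1, U2

def reconstruct(G, U1, U2):
    # Expand the core back to the original spatial resolution.
    return np.einsum('abk,ia,jb->ijk', G, U1, U2)

# Usage: a tiny synthetic "video" of 20 frames, 16x16 pixels each.
rng = np.random.default_rng(0)
video = rng.standard_normal((16, 16, 20))
G, U1, U2 = tucker2(video, r1=4, r2=4)
approx = reconstruct(G, U1, U2)
print(G.shape)  # (4, 4, 20): compact spatial core, full temporal axis
```

With ranks below the spatial dimensions, `approx` is a low-rank spatial summary of every frame; with full ranks the reconstruction is exact, since each factor matrix is then orthogonal.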

Keywords