Jisuanji kexue yu tansuo (Jun 2021)

Spatio-Temporal Correlation Based Adaptive Feature Learning of Tracking Object

  • GUO Mingzhe, CAI Zixin, WANG Xinyue, JING Liping, YU Jian

DOI
https://doi.org/10.3778/j.issn.1673-9418.2007002
Journal volume & issue
Vol. 15, no. 6
pp. 1049 – 1061

Abstract

Object tracking has been a difficult problem in computer vision in recent years. The core task is to continuously locate an object in a video sequence and mark its location with a bounding box. Most existing tracking methods borrow the idea of object detection: they split the video sequence into frames and detect the target in each frame separately. Although this strategy makes full use of the information in the current frame, it ignores the spatio-temporal correlation among frames, which is the key to adapting to changes in the target's appearance and to fully representing the target. To solve this problem, this paper proposes a spatio-temporal siamese network (STSiam) based on spatio-temporal correlation. STSiam uses spatio-temporal correlation information for target locating and real-time tracking in two stages: object localization and object representation. In the object localization stage, STSiam adaptively captures the features of the target and its surrounding area and updates the target matching template so that matching is not affected by appearance changes. In the object representation stage, STSiam attends to the spatial correlation between corresponding regions in different frames. Using the localization result, STSiam locates the target area and learns bounding-box correction parameters so that the bounding box fits the target as closely as possible. The network is trained offline, and its parameters do not need to be updated during online tracking, which preserves real-time tracking speed. Extensive experiments on visual tracking benchmarks including OTB2015, VOT2016, VOT2018 and LaSOT demonstrate that STSiam achieves state-of-the-art performance in terms of accuracy, robustness and speed compared with existing methods.
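
The general idea behind siamese template matching with an adaptive template can be illustrated with a minimal sketch. This is not the STSiam architecture: the function names, the raw (unnormalized) cross-correlation over plain pixel values, and the linear template update with a fixed `rate` are all assumptions chosen for illustration. A real siamese tracker correlates learned deep features rather than raw values.

```python
# Hypothetical sketch of template matching with an adaptive template.
# All names and the update rule are illustrative assumptions, not STSiam.

def cross_correlate(search, template):
    """Slide the template over the search area and return the score map."""
    sh, sw = len(search), len(search[0])
    th, tw = len(template), len(template[0])
    scores = []
    for y in range(sh - th + 1):
        row = []
        for x in range(sw - tw + 1):
            s = 0.0
            for i in range(th):
                for j in range(tw):
                    s += search[y + i][x + j] * template[i][j]
            row.append(s)
        scores.append(row)
    return scores


def locate(search, template):
    """Return (row, col) of the top-left corner with the highest score."""
    scores = cross_correlate(search, template)
    positions = ((y, x) for y in range(len(scores)) for x in range(len(scores[0])))
    return max(positions, key=lambda p: scores[p[0]][p[1]])


def crop(search, top, left, height, width):
    """Cut the matched patch out of the search area."""
    return [row[left:left + width] for row in search[top:top + height]]


def update_template(template, patch, rate=0.1):
    """Blend the matched patch into the template (assumed linear update)."""
    return [
        [(1 - rate) * t + rate * p for t, p in zip(trow, prow)]
        for trow, prow in zip(template, patch)
    ]


# Usage: find a 2x2 target inside a 4x4 search area, then refresh the template.
template = [[1, 2], [3, 4]]
search = [
    [0, 0, 0, 0],
    [0, 0, 1, 2],
    [0, 0, 3, 4],
    [0, 0, 0, 0],
]
top, left = locate(search, template)
patch = crop(search, top, left, 2, 2)
template = update_template(template, patch, rate=0.1)
```

Blending each matched patch into the template is one simple way to follow gradual appearance changes without retraining any parameters online; the blend rate trades adaptation speed against drift.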

Keywords