IEEE Access (Jan 2017)
Active Trace: A Sparse Spatiotemporal Representation for Videos
Abstract
This paper proposes a sparse video representation built on a deformable spatiotemporal template feature, named the active trace template. An active trace is the motion track of an active spatial feature that moves at a certain velocity. To accommodate geometric variations of the spatial feature and motion variations of the temporal track, the atomic spatial feature in each video frame can slightly shift its location and other attributes within certain ranges to best represent the salient trackable structure. The representation quality is quantified by a spatiotemporal score, which is computed through a newly proposed spatiotemporal hierarchical architecture of sum-max maps. Based on this score, a small number of the best active trace templates are selected from all the trace candidates to depict the video sketch. Experiments demonstrate that, for natural videos, the proposed model provides an intuitive and sparse representation that matches human vision and reveals the spatiotemporal correspondence across consecutive frames, even in challenging situations such as occlusion. Furthermore, the model shows potential for high-level vision tasks, as illustrated by moving object detection and segmentation and by action template learning and representation.
Keywords