Machine Learning with Applications (Dec 2022)
Temporal-stochastic tensor features for action recognition
Abstract
In this paper, we propose the Temporal-Stochastic Product Grassmann Manifold (TS-PGM), an efficient method for tensor classification in tasks such as gesture and action recognition. Our approach builds on the idea of representing tensors as points on a Product Grassmann Manifold (PGM). This is achieved by mapping each tensor mode to a linear subspace, where each subspace can be seen as a point on the Grassmann Manifold (GM) of the corresponding mode. The factor manifolds of the respective modes can then be unified in a natural way via the PGM. However, this approach may discard discriminative information by treating all modes equally and by ignoring the temporal nature of tensors such as videos. Therefore, we introduce Temporal-Stochastic Tensor (TST) features to extract temporal information from tensors and encode it in a sequence-preserving TST subspace. These features and the regular tensor modes can then be used simultaneously on the PGM. Our framework addresses the classification of temporal tensors while inheriting the unified mathematical interpretation of the PGM, because the TST subspace can be naturally integrated into the PGM as a new factor manifold. Additionally, we enhance our method in two ways: (1) we improve discrimination by projecting subspaces onto a Generalized Difference Subspace, and (2) we use kernel mapping to construct kernelized subspaces that can handle nonlinear data distributions. Experimental results on gesture and action recognition datasets show that our methods, based on subspace representation with explicit TST features, outperform pure spatio-temporal approaches.
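The core representation described above, mapping each tensor mode to a linear subspace that serves as a point on the corresponding Grassmann manifold, can be sketched as follows. This is a minimal illustration using mode unfolding and a truncated SVD; the tensor shape and subspace dimension are illustrative, and the paper's TST feature construction and classification steps are not shown.

```python
import numpy as np

def mode_unfold(T, mode):
    # Mode-n unfolding: bring the chosen axis to the front and
    # flatten the remaining axes into columns.
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def grassmann_point(T, mode, dim):
    # An orthonormal basis of the leading `dim` left singular vectors
    # of the mode unfolding spans a linear subspace, i.e. a point on
    # the Grassmann manifold of that mode.
    U, _, _ = np.linalg.svd(mode_unfold(T, mode), full_matrices=False)
    return U[:, :dim]

# Toy "video" tensor: frames x height x width (illustrative sizes).
video = np.random.rand(30, 16, 16)

# One subspace per mode; together they form a point on the product
# of the three factor Grassmann manifolds.
points = [grassmann_point(video, m, dim=3) for m in range(video.ndim)]
```

Each basis matrix is orthonormal (`U.T @ U` is the identity), which is the defining property of a Grassmann point representation.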