Machine Learning with Applications (Dec 2022)

Temporal-stochastic tensor features for action recognition

  • Bojan Batalo,
  • Lincon S. Souza,
  • Bernardo B. Gatto,
  • Naoya Sogi,
  • Kazuhiro Fukui

Journal volume & issue
Vol. 10
p. 100407

Abstract


In this paper, we propose the Temporal-Stochastic Product Grassmann Manifold (TS-PGM), an efficient method for tensor classification in tasks such as gesture and action recognition. Our approach builds on the idea of representing tensors as points on a Product Grassmann Manifold (PGM). This is achieved by mapping each tensor mode to a linear subspace, where each subspace can be seen as a point on the Grassmann Manifold (GM) of the corresponding mode. The factor manifolds of the respective modes can then be unified in a natural way via the PGM. However, this approach may discard discriminative information by treating all modes equally and ignoring the sequential nature of temporal tensors such as videos. Therefore, we introduce Temporal-Stochastic Tensor features (TST features) to extract temporal information from tensors and encode it in a sequence-preserving TST subspace. These features and the regular tensor modes can then be used simultaneously on the PGM. Our framework addresses the classification of temporal tensors while inheriting the unified mathematical interpretation of the PGM, because the TST subspace can be naturally integrated into the PGM as a new factor manifold. Additionally, we enhance our method in two ways: (1) we improve discrimination ability by projecting subspaces onto a Generalized Difference Subspace, and (2) we employ kernel mapping to construct kernelized subspaces capable of handling nonlinear data distributions. Experimental results on gesture and action recognition datasets show that our methods, based on subspace representation with explicit TST features, outperform pure spatio-temporal approaches.
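To make the PGM representation described in the abstract concrete, the sketch below shows the generic construction of per-mode subspaces: each mode unfolding of a tensor is reduced to an orthonormal basis via SVD, giving one point per mode on the corresponding Grassmann manifold, and the tuple of these bases is a point on the product manifold. This is a minimal illustration under stated assumptions, not the authors' implementation; the helper names (`unfold`, `mode_subspace`, `pgm_representation`) and the chosen subspace dimensions are hypothetical, and the paper's TST features, Generalized Difference Subspace projection, and kernel mapping are not included here.

```python
import numpy as np


def unfold(tensor, mode):
    """Mode-n unfolding: bring `mode` to the front and flatten the remaining modes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def mode_subspace(tensor, mode, dim):
    """Orthonormal basis of the leading `dim`-dimensional subspace of the mode-n
    unfolding, i.e. one point on the Grassmann manifold of that mode."""
    U, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
    return U[:, :dim]


def pgm_representation(tensor, dims):
    """Represent a tensor as a tuple of per-mode subspaces
    (a point on the product Grassmann manifold)."""
    return [mode_subspace(tensor, m, d) for m, d in enumerate(dims)]


def projection_similarity(U1, U2):
    """Similarity of two subspaces via the projection metric:
    the sum of squared cosines of their canonical angles."""
    return np.linalg.norm(U1.T @ U2, "fro") ** 2


# Toy example: a "video" tensor of shape (height, width, frames).
video_a = np.random.rand(32, 32, 20)
video_b = np.random.rand(32, 32, 20)

subspaces_a = pgm_representation(video_a, dims=(5, 5, 5))
subspaces_b = pgm_representation(video_b, dims=(5, 5, 5))

# A simple product-manifold similarity: sum the per-mode similarities.
score = sum(projection_similarity(Ua, Ub)
            for Ua, Ub in zip(subspaces_a, subspaces_b))
print([S.shape for S in subspaces_a])  # [(32, 5), (32, 5), (20, 5)]
print(score)
```

In the paper's setting, a sequence-preserving TST subspace would be added to this tuple as an extra factor manifold; here, summing per-mode projection similarities is only one plausible way to combine factor manifolds and is shown purely for illustration.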

Keywords