Mathematics (Mar 2023)
OFPI: Optical Flow Pose Image for Action Recognition
Abstract
Most approaches to action recognition based on pseudo-images involve encoding skeletal data into RGB-like image representations. This approach cannot fully exploit the kinematic features and structural information of human poses, and convolutional neural network (CNN) models that process pseudo-images lack a global field of view and cannot completely extract action features from pseudo-images. In this paper, we propose a novel pose-based action representation method called Optical Flow Pose Image (OFPI) in order to fully capitalize on the spatial and temporal information of skeletal data. Specifically, in the proposed method, an advanced pose estimator collects skeletal data before locating the target person and then extracts skeletal data utilizing a human tracking algorithm. The OFPI representation is obtained by aggregating these skeletal data over time. To test the superiority of OFPI and investigate the significance of the model having a global field of view, we trained a simple CNN model and a transformer-based model, respectively. Both models achieved superior outcomes. Because of the global field of view, especially in the transformer-based model, the OFPI-based representation achieved 98.3% and 94.2% accuracy on the KTH and JHMDB datasets, respectively. Compared with other advanced pose representation methods and multi-stream methods, OFPI achieved state-of-the-art performance on the JHMDB dataset, indicating the utility and potential of this algorithm for skeleton-based action recognition research.
Keywords