Contextual Action Cues from Camera Sensor for Multi-Stream Action Recognition

Jongkwang Hong; Bora Cho; Yong Won Hong; Hyeran Byun

doi:10.3390/s19061382

Sensors (Mar 2019)

Affiliations

Jongkwang Hong: Department of Computer Science, Yonsei University, Seoul 03722, Korea
Bora Cho: Department of Computer Science, Yonsei University, Seoul 03722, Korea
Yong Won Hong: Department of Computer Science, Yonsei University, Seoul 03722, Korea
Hyeran Byun: Department of Computer Science, Yonsei University, Seoul 03722, Korea

DOI: https://doi.org/10.3390/s19061382
Journal volume & issue: Vol. 19, no. 6, p. 1382

Abstract

In action recognition research, the two primary types of information are appearance and motion, both learned from RGB images captured by visual sensors. However, depending on the characteristics of the action, contextual information, such as the presence of specific objects or globally shared information in the image, becomes vital for defining the action. For example, the presence of a ball is vital for distinguishing “kicking” from “running”. Furthermore, some actions share typical global abstract poses, which can be used as a key to classify actions. Based on these observations, we propose a multi-stream network model that incorporates spatial, temporal, and contextual cues in the image for action recognition. We evaluated the proposed method with C3D or inflated 3D ConvNet (I3D) as the backbone network on two different action recognition datasets. We observed an overall improvement in accuracy, demonstrating the effectiveness of the proposed method.

Published in Sensors

ISSN: 1424-8220 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Chemical technology
Website: http://www.mdpi.com/journal/sensors
