Body-Part-Aware and Multitask-Aware Single-Image-Based Action Recognition

Bhishan Bhandari; Geonu Lee; Jungchan Cho

doi:10.3390/app10041531

Applied Sciences (Feb 2020)

Affiliations

Bhishan Bhandari: College of Information Technology, Gachon University, Seongnam 13120, Korea
Geonu Lee: College of Information Technology, Gachon University, Seongnam 13120, Korea
Jungchan Cho: College of Information Technology, Gachon University, Seongnam 13120, Korea

Journal volume & issue: Vol. 10, no. 4, p. 1531

Abstract

Action recognition is an application that ideally requires real-time results. We focus on single-image-based rather than video-based action recognition because it is faster and computationally cheaper. However, a single image contains limited information, which makes single-image-based action recognition difficult. To obtain an accurate representation of action classes, we propose three shallow feature-stream sub-networks (image-based, attention-image-based, and part-image-based feature networks) built on a deep pose estimation network and trained in a multitask manner. Moreover, we design a multitask-aware loss function so that the proposed method can be trained adaptively on heterogeneous datasets that include only human pose annotations or only action labels (rather than both). This makes it easier to apply the proposed approach to new data for behavioral analysis in intelligent systems. In extensive experiments, we show that these streams carry complementary information and, hence, that the fused representation is robust in distinguishing diverse fine-grained action classes. Unlike in other methods, the human pose information was learned from heterogeneous datasets in a multitask manner; nevertheless, the proposed method achieved 91.91% mean average precision on the Stanford 40 Actions Dataset. Moreover, we demonstrate that the proposed method can be flexibly applied to the multi-label action recognition problem on the V-COCO Dataset.
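The abstract does not give the form of the multitask-aware loss, so the following is only a minimal sketch of the general idea it describes: each sample contributes to the pose term only if it has a pose annotation and to the action term only if it has an action label. The function names, the dictionary keys, the choice of mean squared error and softmax cross-entropy, and the two weights are all assumptions made for illustration, not the authors' formulation.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy of one sample from raw class scores (numerically stable)."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def mean_squared_error(pred, target):
    """Mean squared error between two equal-length lists of numbers."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def masked_multitask_loss(batch, pose_weight=1.0, action_weight=1.0):
    """Hypothetical masked multitask loss over a heterogeneous batch.

    Each sample is a dict with keys 'pose_pred' and 'action_logits', plus
    'pose_gt' and 'action_gt', either of which may be None when that
    annotation is missing. A missing annotation simply contributes nothing
    to the corresponding term.
    """
    pose_terms, action_terms = [], []
    for sample in batch:
        if sample.get("pose_gt") is not None:
            pose_terms.append(
                mean_squared_error(sample["pose_pred"], sample["pose_gt"])
            )
        if sample.get("action_gt") is not None:
            action_terms.append(
                softmax_cross_entropy(sample["action_logits"], sample["action_gt"])
            )
    # Average each term over the samples that actually carry that annotation.
    pose_loss = sum(pose_terms) / len(pose_terms) if pose_terms else 0.0
    action_loss = sum(action_terms) / len(action_terms) if action_terms else 0.0
    return pose_weight * pose_loss + action_weight * action_loss


# One pose-only sample and one action-only sample in the same batch.
batch = [
    {"pose_pred": [0.0, 1.0], "pose_gt": [0.0, 0.0],
     "action_logits": [5.0, -5.0], "action_gt": None},
    {"pose_pred": [0.3, 0.3], "pose_gt": None,
     "action_logits": [0.0, 0.0], "action_gt": 1},
]
loss = masked_multitask_loss(batch)
```

Averaging each term over only the annotated samples keeps the scale of the pose and action terms independent of how many samples of each kind happen to fall in a batch. A multi-label setting such as V-COCO would need a per-class sigmoid loss in place of the softmax cross-entropy assumed here.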

Published in Applied Sciences

ISSN: 2076-3417 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Engineering (General). Civil engineering (General); Science: Biology (General); Science: Physics; Science: Chemistry
Website: http://www.mdpi.com/journal/applsci
