Digital Communications and Networks (Jun 2024)
Behaviour recognition based on the integration of multigranular motion features in the Internet of Things
Abstract
With the adoption of cutting-edge communication technologies such as 5G/6G systems and the extensive deployment of devices, crowdsensing systems in the Internet of Things (IoT) now carry out complex video analysis tasks such as behaviour recognition. These applications have dramatically increased the diversity of IoT systems. Behaviour recognition in videos typically requires a combined analysis of the spatial information about objects and the information about their dynamic actions in the temporal dimension. In contrast to image-based computer vision tasks, which focus on understanding spatial information, behaviour recognition relies even more heavily on modelling temporal information that contains both short-range and long-range motions. However, current solutions fail to jointly and comprehensively analyse the short-range motions between adjacent frames and the long-range temporal aggregations at larger scales in videos. In this paper, we propose a novel behaviour recognition method based on the integration of multigranular (IMG) motion features, which can support the deployment of video analysis in multimedia IoT crowdsensing systems. In particular, we achieve reliable motion information modelling by integrating a channel attention-based short-term motion feature enhancement module (CSEM) and a cascaded long-term motion feature integration module (CLIM). We evaluate our model on several action recognition benchmarks, including HMDB51, Something-Something and UCF101. The experimental results demonstrate that our approach outperforms previous state-of-the-art methods, confirming its effectiveness and efficiency.
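To make the idea of channel attention-based short-term motion enhancement concrete, the sketch below shows one plausible way such a block could be structured: adjacent-frame feature differences provide a short-range motion cue, and a squeeze-and-excitation-style channel attention derived from that cue re-weights the original features. This is an illustrative assumption for exposition only; the class name, tensor layout and internal design are hypothetical and are not the paper's actual CSEM.

```python
import torch
import torch.nn as nn


class ShortTermMotionEnhancement(nn.Module):
    """Hypothetical sketch of a channel-attention-based short-term motion
    enhancement block (illustrative only, not the paper's CSEM).
    Expects clip features of shape (B, T, C, H, W)."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, c, h, w = x.shape
        # Short-range motion cue: difference between adjacent frames.
        # The last frame is paired with itself, yielding a zero difference.
        nxt = torch.cat([x[:, 1:], x[:, -1:]], dim=1)
        diff = nxt - x                                           # (B, T, C, H, W)
        # Squeeze: global average pooling of the motion cue per channel.
        squeezed = self.pool(diff.reshape(b * t, c, h, w)).flatten(1)  # (B*T, C)
        # Excite: channel attention weights derived from the motion cue.
        weights = self.fc(squeezed).reshape(b, t, c, 1, 1)
        # Re-weight the original features and keep a residual path.
        return x + x * weights


if __name__ == "__main__":
    feats = torch.randn(2, 8, 64, 14, 14)     # 2 clips, 8 frames, 64 channels
    enhanced = ShortTermMotionEnhancement(64)(feats)
    print(enhanced.shape)                      # torch.Size([2, 8, 64, 14, 14])
```

A long-term counterpart in the spirit of the CLIM would then aggregate such enhanced features across larger temporal spans, for example by cascading temporal pooling or convolution stages over the frame dimension.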