A Dynamic Position Embedding-Based Model for Student Classroom Complete Meta-Action Recognition

Zhaoyu Shou; Xiaohu Yuan; Dongxu Li; Jianwen Mo; Huibing Zhang; Jingwei Zhang; Ziyong Wu

doi:10.3390/s24165371

Sensors (Aug 2024)

Affiliations

Zhaoyu Shou: School of Information and Communication, Guilin University of Electronic Technology, Guilin 541004, China
Xiaohu Yuan: School of Information and Communication, Guilin University of Electronic Technology, Guilin 541004, China
Dongxu Li: School of Information and Communication, Guilin University of Electronic Technology, Guilin 541004, China
Jianwen Mo: School of Information and Communication, Guilin University of Electronic Technology, Guilin 541004, China
Huibing Zhang: School of Computer and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
Jingwei Zhang: Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology, Guilin 541004, China
Ziyong Wu: Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology, Guilin 541004, China

DOI: https://doi.org/10.3390/s24165371
Journal volume & issue: Vol. 24, no. 16, p. 5371

Abstract

Precisely recognizing complete classroom meta-actions is a key challenge for the tailored, adaptive interpretation of student behavior, because these actions are intricate. This paper proposes a Dynamic Position Embedding-based Model for Student Classroom Complete Meta-Action Recognition (DPE-SAR), built on the Video Swin Transformer. The model uses a dynamic positional embedding technique to perform conditional positional encoding, and it incorporates a deep convolutional network to improve its parsing of the spatial structure of meta-actions. The full attention mechanism of ViT3D is used to extract the latent spatial features of actions and to capture the global spatial–temporal information of meta-actions. In evaluations on public datasets and on smart classroom meta-action recognition datasets, the proposed model outperforms the baseline action recognition models, and the experimental results confirm its advantage in meta-action recognition.
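
The abstract does not give the exact form of the dynamic positional embedding, so the following is only an illustrative sketch of the general idea behind conditional positional encoding: a position signal is generated from each token's local neighborhood by a zero-padded depthwise convolution over the (time, height, width) token grid and added back to the tokens. The function name `conditional_position_encoding`, the tensor layout, and the kernel size are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def conditional_position_encoding(tokens, kernel):
    """Add a convolution-generated position signal to a grid of video tokens.

    tokens: float array of shape (T, H, W, C), one C-dim token per grid cell.
    kernel: float array of shape (kt, kh, kw, C) with odd kt, kh, kw;
            one independent filter per channel (depthwise convolution).
    Hypothetical helper for illustration only, not the DPE-SAR implementation.
    """
    kt, kh, kw, c = kernel.shape
    t, h, w, c_tok = tokens.shape
    assert c == c_tok, "kernel and tokens must have the same channel count"
    pt, ph, pw = kt // 2, kh // 2, kw // 2

    # Zero padding keeps the output size equal to the input size and makes
    # border tokens see a different neighborhood than interior tokens.
    padded = np.pad(tokens, ((pt, pt), (ph, ph), (pw, pw), (0, 0)))

    # Depthwise 3D cross-correlation: accumulate each shifted copy of the
    # grid, weighted per channel by the matching kernel entry.
    pos = np.zeros(tokens.shape, dtype=float)
    for dt in range(kt):
        for dh in range(kh):
            for dw in range(kw):
                window = padded[dt:dt + t, dh:dh + h, dw:dw + w, :]
                pos += window * kernel[dt, dh, dw, :]

    # Residual connection: tokens plus their input-conditioned position signal.
    return tokens + pos


# Usage: the same kernel applies to clips of different lengths and resolutions.
rng = np.random.default_rng(0)
kernel = rng.normal(size=(3, 3, 3, 8))
short_clip = conditional_position_encoding(rng.normal(size=(4, 7, 7, 8)), kernel)
long_clip = conditional_position_encoding(rng.normal(size=(16, 14, 14, 8)), kernel)
```

Because the encoding is computed from the tokens rather than looked up in a fixed-size table, it adapts to the input size, which is the property usually meant by "conditional" or "dynamic" in this context. In a trained model the kernel would be a learned parameter; here it is random purely for demonstration.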

Published in Sensors

ISSN: 1424-8220 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Chemical technology
Website: http://www.mdpi.com/journal/sensors
