Information (Apr 2022)

Multi-Attention Module for Dynamic Facial Emotion Recognition

  • Junnan Zhi,
  • Tingting Song,
  • Kang Yu,
  • Fengen Yuan,
  • Huaqiang Wang,
  • Guangyang Hu,
  • Hao Yang

DOI
https://doi.org/10.3390/info13050207
Journal volume & issue
Vol. 13, no. 5
p. 207

Abstract

Video-based dynamic facial emotion recognition (FER) is a challenging task: one must capture and distinguish the tiny facial movements that signal emotional changes while ignoring identity-related facial differences between subjects. Recent state-of-the-art studies have usually tackled this task with more complex methods, such as large-scale deep learning models or multimodal analysis combining multiple sub-models. Motivated by the characteristics of the FER task and the shortcomings of existing methods, in this paper we propose a lightweight method and design three attention modules that can be flexibly inserted into the backbone network. The key information along the three dimensions of space, channel, and time is extracted by means of convolution layers, pooling layers, a multi-layer perceptron (MLP), and other operations, and attention weights are generated. By sharing parameters at the same level, the three modules add few network parameters while sharpening the focus on specific facial regions, on the effective feature information of static images, and on key frames. Experimental results on the CK+ and eNTERFACE’05 datasets show that this method achieves higher accuracy.
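For illustration, the sketch below shows one way the three attention dimensions described in the abstract (space, channel, time) could be realized in PyTorch. The module structure, layer sizes, and tensor shapes here are assumptions for the example, not the paper's exact architecture.

```python
# Minimal sketch of spatial, channel, and temporal attention modules,
# assuming CBAM-style channel/spatial attention and a learned per-frame
# score for temporal attention. Illustrative only, not the authors' code.
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """Pool away spatial dims, then a shared MLP yields per-channel weights."""
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, C, H, W)
        avg = x.mean(dim=(2, 3))                          # global avg pool -> (B, C)
        mx = x.amax(dim=(2, 3))                           # global max pool -> (B, C)
        w = torch.sigmoid(self.mlp(avg) + self.mlp(mx))   # shared-parameter MLP
        return x * w[:, :, None, None]


class SpatialAttention(nn.Module):
    """Pool across channels, then a conv layer yields a per-pixel weight map."""
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, C, H, W)
        avg = x.mean(dim=1, keepdim=True)                 # (B, 1, H, W)
        mx = x.amax(dim=1, keepdim=True)                  # (B, 1, H, W)
        w = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * w


class TemporalAttention(nn.Module):
    """Score each frame of a clip so that key frames dominate the aggregate."""
    def __init__(self, feat_dim: int):
        super().__init__()
        self.score = nn.Linear(feat_dim, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, T, D) frame features
        w = torch.softmax(self.score(x), dim=1)           # (B, T, 1) frame weights
        return (x * w).sum(dim=1)                         # weighted clip feature (B, D)


if __name__ == "__main__":
    frames = torch.randn(2, 16, 64, 28, 28)               # (batch, frames, C, H, W)
    ca, sa, ta = ChannelAttention(64), SpatialAttention(), TemporalAttention(64)
    b, t, c, h, w = frames.shape
    f = sa(ca(frames.view(b * t, c, h, w)))               # per-frame channel + spatial attention
    f = f.view(b, t, c, h, w).mean(dim=(3, 4))            # pool to per-frame features (B, T, C)
    print(ta(f).shape)                                    # torch.Size([2, 64])
```

In this sketch, parameter sharing appears in the channel module, where a single MLP processes both the average- and max-pooled descriptors; the temporal module then weights per-frame features so key frames contribute more to the clip-level representation.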

Keywords