IEEE Access (Jan 2024)
A Novel Attention Residual Network Expression Recognition Method
Abstract
Facial expressions are an intuitive reflection of a person's psychological state, so extracting effective features for accurate facial expression recognition is an important research problem. However, when facial information is incomplete, existing convolutional neural networks struggle to extract discriminative features. To address this issue, this paper introduces a pyramidal convolutional attention residual network (PCARNet) built on ResNet18. PCARNet combines a pyramidal convolution module with an improved convolutional attention mechanism to extract expression features effectively and achieve high-precision facial expression recognition. The model uses pyramidal convolution to extract facial expression features at multiple scales, capturing both global and local facial information, while grouped convolution reduces the computational cost and the number of parameters. In addition, to avoid the adverse effect of channel dimensionality reduction on the attention mechanism and to strengthen cross-channel information exchange, the shared MLP in the convolutional attention mechanism is replaced by a one-dimensional convolution with an adaptive kernel size. The improved convolutional attention mechanism weights the extracted multiscale features along both the channel and spatial dimensions, enhancing the representation of crucial facial features. Experimental results show that the proposed method achieves high recognition accuracy on the public datasets Fer2013, RAF-DB, and CK+, reaching 73.725%, 87.516%, and 95.455%, respectively. Compared with other methods, the proposed approach improves accuracy by at least 1.4%, 2.4%, and 0.25% on these datasets, confirming its reliability and performance.
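As a concrete illustration of the two building blocks named in the abstract, the sketch below gives a minimal PyTorch-style implementation of a pyramidal (multi-kernel, grouped) convolution and of a channel attention module in which the shared MLP is replaced by a one-dimensional convolution with an adaptive kernel size. The class names, kernel sizes, group counts, and the ECA-style kernel-size rule are illustrative assumptions, not code from the paper.

```python
# Minimal sketch (assumptions, not the authors' code) of the two PCARNet
# building blocks described in the abstract: pyramidal grouped convolution
# and channel attention whose shared MLP is replaced by a 1-D convolution
# with an adaptive kernel size (ECA-style rule, assumed here).
import math
import torch
import torch.nn as nn


class PyConv(nn.Module):
    """Pyramidal convolution: parallel convolutions with increasing kernel
    sizes and group counts, concatenated along the channel dimension."""

    def __init__(self, in_ch, out_ch, kernels=(3, 5, 7, 9), groups=(1, 4, 8, 16)):
        super().__init__()
        split = out_ch // len(kernels)  # channels produced by each branch
        self.branches = nn.ModuleList(
            nn.Conv2d(in_ch, split, k, padding=k // 2, groups=g, bias=False)
            for k, g in zip(kernels, groups)
        )

    def forward(self, x):
        # Each branch sees the full input but at a different receptive field.
        return torch.cat([branch(x) for branch in self.branches], dim=1)


class ImprovedChannelAttention(nn.Module):
    """CBAM-style channel attention with the shared MLP replaced by a 1-D
    convolution over channels, avoiding channel dimensionality reduction."""

    def __init__(self, channels, gamma=2, b=1):
        super().__init__()
        # Adaptive kernel size: grows with log2(C) and is forced to be odd.
        k = int(abs((math.log2(channels) + b) / gamma))
        k = k if k % 2 == 1 else k + 1
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.max_pool = nn.AdaptiveMaxPool2d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        # x: (B, C, H, W) -> pooled channel descriptors of shape (B, C, 1, 1)
        avg = self.avg_pool(x)
        mx = self.max_pool(x)
        # Reshape to (B, 1, C) so the 1-D conv mixes neighbouring channels.
        avg = self.conv(avg.squeeze(-1).transpose(1, 2))
        mx = self.conv(mx.squeeze(-1).transpose(1, 2))
        w = self.sigmoid(avg + mx).transpose(1, 2).unsqueeze(-1)  # (B, C, 1, 1)
        return x * w


if __name__ == "__main__":
    feat = torch.randn(2, 64, 28, 28)
    feat = PyConv(64, 64)(feat)                     # multiscale features
    out = ImprovedChannelAttention(channels=64)(feat)
    print(out.shape)                                # torch.Size([2, 64, 28, 28])
```

In a full model, a spatial attention stage would follow the channel attention so that the multiscale features are reweighted along both dimensions, as the abstract describes.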
Keywords