Zhejiang Daxue xuebao. Lixue ban (Jan 2025)
Micro-expression recognition based on self-supervised masked optical flow
Abstract
In micro-expression recognition, existing feature extraction methods rely primarily on supervised learning, which limits both the generalizability of the learned feature representations and the scenarios in which the models can be applied. To address this issue, this study uses a masked autoencoder (MAE) to build a self-supervised feature encoder tailored to micro-expression datasets, applies it to the downstream micro-expression recognition task, and proposes a recognition method based on self-supervised masked optical flow (SMOF). First, optical flow maps are computed between the onset frame and each of the four frames adjacent to the apex frame (two on each side), and these maps serve as the initial input. Next, resampling is used to balance the number of samples across categories. Finally, the augmented dataset is used to pre-train the SMOF model, which randomly masks local patches of the input image and reconstructs the missing pixels in those patches; recovering the original information from partial features in this way indicates strong feature representation capability. In the downstream task, the pre-trained feature extractor is loaded into a ViT-Large model, which achieves good micro-expression classification results. Overall performance is evaluated on the 3DB micro-expression dataset using leave-one-subject-out (LOSO) cross-validation. The SMOF-based method achieves an unweighted F1 score (UF1) of 0.8614 and an unweighted average recall (UAR) of 0.8716, outperforming other deep learning methods for micro-expression recognition. The experimental results show that using self-supervised masked optical flow to pre-train a model for micro-expression feature extraction is feasible and effective. The approach not only improves micro-expression recognition performance but also shows good transferability and scalability.
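The two reported metrics, UF1 and UAR, average per-class F1 and per-class recall with equal weight for every class, so minority classes count as much as majority ones. The sketch below shows one common way to compute them with NumPy. The function name `uf1_uar`, the three-class toy labels, and the choice to score a single pooled list of predictions (for example, all LOSO folds concatenated) are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def uf1_uar(y_true, y_pred, num_classes):
    """Return (UF1, UAR) for integer labels in [0, num_classes).

    Hypothetical helper for illustration: it assumes predictions from
    all held-out subjects have already been pooled into y_pred.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    f1_scores, recalls = [], []
    for c in range(num_classes):
        tp = np.sum((y_true == c) & (y_pred == c))  # true positives
        fp = np.sum((y_true != c) & (y_pred == c))  # false positives
        fn = np.sum((y_true == c) & (y_pred != c))  # false negatives
        # Per-class F1 = 2TP / (2TP + FP + FN); 0 if the class never occurs.
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom > 0 else 0.0)
        # Per-class recall = TP / (TP + FN); 0 if the class has no samples.
        support = tp + fn
        recalls.append(tp / support if support > 0 else 0.0)
    # Unweighted (macro) averages over classes.
    return float(np.mean(f1_scores)), float(np.mean(recalls))


if __name__ == "__main__":
    # Toy labels for three assumed classes (0, 1, 2).
    y_true = [0, 0, 0, 0, 1, 1, 2, 2]
    y_pred = [0, 0, 0, 1, 1, 1, 2, 0]
    uf1, uar = uf1_uar(y_true, y_pred, num_classes=3)
    print(f"UF1 = {uf1:.4f}, UAR = {uar:.4f}")
```

Because each class contributes equally to both averages, a model that ignores a rare class is penalized even if its overall accuracy is high, which is presumably why these metrics are preferred for imbalanced micro-expression data.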
Keywords