IEEE Access (Jan 2023)

Unleashing the Transferability Power of Unsupervised Pre-Training for Emotion Recognition in Masked and Unmasked Facial Images

  • Moreno D'inca,
  • Cigdem Beyan,
  • Radoslaw Niewiadomski,
  • Simone Barattin,
  • Nicu Sebe

DOI
https://doi.org/10.1109/ACCESS.2023.3308047
Journal volume & issue
Vol. 11
pp. 90876–90890

Abstract

Facial expressions are an essential part of nonverbal communication and major indicators of human emotions. Effective automatic Facial Emotion Recognition (FER) systems can facilitate comprehension of an individual’s intentions and prospective behaviors in Human-Computer and Human-Robot Interaction. However, FER faces an enduring challenge, commonly encountered in real life: partial occlusions caused by objects such as sunglasses and hands. With the onset of the COVID-19 pandemic, facial masks became a major obstruction for FER systems. The use of facial masks exacerbates the occlusion issue since they cover a significant portion of a person’s face, including the highly informative mouth area from which positive and negative emotions can be differentiated. Moreover, the efficacy of FER is largely contingent upon the supervised learning paradigm, which necessitates costly and laborious data annotation. Our study centers on utilizing the reconstruction capability of a Convolutional Residual Autoencoder to differentiate between positive and negative emotions. The proposed approach employs unsupervised feature learning and takes as input facial images of individuals with and without masks. Our study places particular emphasis on the transferability of the proposed approach to different domains in comparison to current state-of-the-art fully supervised methods. A comprehensive experimental evaluation demonstrates the superior transferability of the proposed approach, highlighting the effectiveness of the unsupervised feature learning pipeline. The proposed approach outperforms more complex methods in some scenarios while incurring relatively low computational expense. The source code of the proposed approach, along with the facial images created for this study, is accessible HERE.
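
To make the described pipeline concrete, the following PyTorch sketch illustrates what training a convolutional residual autoencoder with a pure reconstruction objective (no emotion labels) can look like. This is a minimal illustration, not the authors' released implementation: all layer widths, the input resolution, and the class names (ResidualBlock, ConvResidualAutoencoder) are assumptions introduced here for clarity; refer to the linked source code for the actual method.

# Illustrative sketch only; architecture details are assumptions, not the paper's exact model.
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Two conv layers with an identity skip connection."""
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(x + self.body(x))  # residual connection

class ConvResidualAutoencoder(nn.Module):
    """Encoder compresses the face image; decoder reconstructs it.
    The encoder output serves as the unsupervised feature representation."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1),   # 64x64 -> 32x32
            nn.ReLU(inplace=True),
            ResidualBlock(32),
            nn.Conv2d(32, 64, 4, stride=2, padding=1),  # 32x32 -> 16x16
            nn.ReLU(inplace=True),
            ResidualBlock(64),
        )
        self.decoder = nn.Sequential(
            ResidualBlock(64),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1),  # 16x16 -> 32x32
            nn.ReLU(inplace=True),
            ResidualBlock(32),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1),   # 32x32 -> 64x64
            nn.Sigmoid(),  # pixel values in [0, 1]
        )

    def forward(self, x):
        z = self.encoder(x)            # features learned without labels
        return self.decoder(z), z

# Training-step sketch: reconstruction loss only, so no emotion annotation is needed.
model = ConvResidualAutoencoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
images = torch.rand(8, 3, 64, 64)      # stand-in batch for masked/unmasked face crops
recon, feats = model(images)
loss = nn.functional.mse_loss(recon, images)
loss.backward()
opt.step()

After such pre-training, the encoder features (feats above) could feed a lightweight classifier for positive-versus-negative emotion discrimination, which is one plausible reading of how a reconstruction-based representation transfers across masked and unmasked domains.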

Keywords