IEEE Access (Jan 2024)

Graph-Based Multi-Modal Multi-View Fusion for Facial Action Unit Recognition

  • Jianrong Chen,
  • Sujit Dey

DOI: https://doi.org/10.1109/ACCESS.2024.3401168
Journal volume & issue: Vol. 12, pp. 69310–69324

Abstract

Facial action unit (AU) detection is a fundamental step in affective computing and plays a crucial role in applications such as human-computer interaction, psychology, and social robotics. Despite recent advances, facial AU detection remains challenging, particularly in real-world scenarios with diverse lighting conditions and head poses. This paper first presents a new, realistically challenging multi-modal and multi-view AU dataset captured in a real-world vehicle environment. We then introduce a novel graph-based multi-modal multi-view fusion framework, tailored for challenging environments such as those encountered in Advanced Driver-Assistance Systems (ADAS), that significantly enhances AU detection performance under these difficult conditions. Our fusion model shows significant gains over current single-modality methods, with marked F1-score improvements across most AUs; specifically, it achieves a 9.0% improvement in overall average F1 score over the best-performing single-modality model. The results validate that integrating multiple modalities and viewpoints substantially boosts the model's robustness and accuracy under diverse conditions, offering a meaningful advancement over the state-of-the-art.
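The abstract does not spell out the fusion architecture, but the core idea, treating each modality-view feature as a node in a graph and letting nodes exchange information before classification, can be illustrated with a minimal PyTorch sketch. Everything below (the GraphFusion name, the fully connected node graph, the attention-style message passing, and the feat_dim and num_aus parameters) is an illustrative assumption, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

class GraphFusion(nn.Module):
    """Fuse per-(modality, view) features as nodes of a small graph.

    Each of the N input nodes is one modality/view feature vector; a
    single self-attention pass over a fully connected graph lets every
    node aggregate the others before pooling into a multi-label AU head.
    """
    def __init__(self, feat_dim=256, num_aus=12):
        super().__init__()
        self.query = nn.Linear(feat_dim, feat_dim)
        self.key = nn.Linear(feat_dim, feat_dim)
        self.value = nn.Linear(feat_dim, feat_dim)
        self.classifier = nn.Linear(feat_dim, num_aus)

    def forward(self, nodes):  # nodes: (batch, N, feat_dim)
        q, k, v = self.query(nodes), self.key(nodes), self.value(nodes)
        # Scaled dot-product attention = message passing on a complete graph.
        attn = torch.softmax(q @ k.transpose(1, 2) / q.shape[-1] ** 0.5, dim=-1)
        fused = attn @ v
        pooled = fused.mean(dim=1)       # aggregate all modality/view nodes
        return self.classifier(pooled)   # per-AU logits (BCEWithLogitsLoss)

# Example: 2 modalities x 3 camera views = 6 graph nodes per sample.
model = GraphFusion()
logits = model(torch.randn(4, 6, 256))  # -> shape (4, 12)
```

A fully connected node graph is the simplest possible choice here; the paper's framework may instead learn or prune the edges between modalities and views.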

Keywords