IEEE Access (Jan 2024)

MM DialogueGAT—A Fusion Graph Attention Network for Emotion Recognition Using Multi-Model System

  • Rui Fu,
  • Xiaomei Gai,
  • Ahmed Abdulhakim Al-Absi,
  • Mohammed Abdulhakim Al-Absi,
  • Muhammad Alam,
  • Ye Li,
  • Meng Jiang,
  • Xuewei Wang

DOI
https://doi.org/10.1109/ACCESS.2024.3350156
Journal volume & issue
Vol. 12
pp. 150941 – 150952

Abstract


Emotion recognition is an important part of human-computer interaction, and the information exchanged in human communication is inherently multi-modal. Despite advances in emotion recognition models, two challenges persist. The first (Problem 1) is that existing research concentrates on mining the interaction information between modalities and the context information of the dialogue, but neglects the role information that links modality states to dialogue context. The second (Problem 2) is that context information in a dialogue is not fully transmitted through a purely temporal structure. To address these two problems, we propose a multi-model fusion dialogue graph attention network (MM DialogueGAT). For Problem 1, a bidirectional GRU is used to extract features from each modality. For the fusion problem, different modality configurations and combinations are fused through a cross-modal multi-head attention layer, with text, video, and audio serving as the main and auxiliary modalities. For Problem 2, a GAT graph structure is used to capture the context information within each modality rather than relying only on temporal propagation. Experiments show that our model achieves strong results on the IEMOCAP dataset.
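The abstract describes a three-stage pipeline: per-modality bidirectional GRU encoding, cross-modal multi-head attention fusion with text as the main modality, and graph attention over dialogue context. The sketch below illustrates that pipeline in plain PyTorch under stated assumptions; the feature dimensions, layer sizes, class count, and the simplified masked-attention stand-in for the GAT layer are illustrative choices, not the authors' implementation.

```python
# Minimal sketch of the MM DialogueGAT pipeline described in the abstract.
# All dimensions and the simplified graph-attention step are assumptions.
import torch
import torch.nn as nn


class MMDialogueSketch(nn.Module):
    def __init__(self, d_text=100, d_audio=74, d_video=35, d_model=128, n_heads=4):
        super().__init__()
        # Step 1: a bidirectional GRU per modality extracts utterance features.
        self.gru_t = nn.GRU(d_text, d_model // 2, bidirectional=True, batch_first=True)
        self.gru_a = nn.GRU(d_audio, d_model // 2, bidirectional=True, batch_first=True)
        self.gru_v = nn.GRU(d_video, d_model // 2, bidirectional=True, batch_first=True)
        # Step 2: cross-modal multi-head attention; text is the main (query)
        # modality, audio and video are auxiliary (key/value) modalities.
        self.attn_ta = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.attn_tv = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Step 3: context over the dialogue graph; approximated here by
        # self-attention restricted with an adjacency mask (a full GAT layer
        # would be used in the paper's model).
        self.graph_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.classifier = nn.Linear(d_model, 6)  # 6 emotion classes, as commonly used for IEMOCAP

    def forward(self, text, audio, video, adj_mask):
        # text/audio/video: (batch, num_utterances, feature_dim)
        h_t, _ = self.gru_t(text)
        h_a, _ = self.gru_a(audio)
        h_v, _ = self.gru_v(video)
        # Fuse the auxiliary modalities into the main text stream.
        fused_a, _ = self.attn_ta(h_t, h_a, h_a)
        fused_v, _ = self.attn_tv(h_t, h_v, h_v)
        fused = h_t + fused_a + fused_v
        # Attend only over utterances connected in the dialogue graph.
        ctx, _ = self.graph_attn(fused, fused, fused, attn_mask=adj_mask)
        return self.classifier(ctx)


if __name__ == "__main__":
    B, N = 2, 10  # dialogues, utterances per dialogue
    model = MMDialogueSketch()
    mask = torch.zeros(N, N, dtype=torch.bool)  # False = attention allowed
    logits = model(torch.randn(B, N, 100), torch.randn(B, N, 74),
                   torch.randn(B, N, 35), mask)
    print(logits.shape)  # torch.Size([2, 10, 6])
```

In this reading, the adjacency mask plays the role of the dialogue graph edges: context flows between any connected utterances rather than strictly along the temporal order, which is the point of Problem 2.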

Keywords