IEEE Access (Jan 2025)

Novel Speech-Based Emotion Climate Recognition in Peers’ Conversations Incorporating Affect Dynamics and Temporal Convolutional Neural Networks

  • Ghada Alhussein,
  • Mohanad Alkhodari,
  • Ahsan H. Khandoker,
  • Leontios J. Hadjileontiadis

DOI
https://doi.org/10.1109/ACCESS.2025.3529125
Journal volume & issue
Vol. 13
pp. 16752 – 16769

Abstract


Peers’ conversation provides a domain of rich emotional information. Beyond facial and gestural expressions, this information is also naturally conveyed via peers’ speech, contributing to the establishment of a dynamic emotion climate (EC) during their conversational interaction. Recognition of EC could provide an additional source for understanding peers’ social interaction and behavior on top of peers’ actual conversational content. We propose a novel approach for speech-based EC recognition, termed AffECt, which combines peers’ complex affect dynamics (AD) with deep features extracted from speech signals using Temporal Convolutional Neural Networks (TCNNs). AffECt was tested and cross-validated on data drawn from three open datasets, i.e., K-EmoCon, IEMOCAP, and SEWA, in terms of EC arousal/valence level classification. The experimental results show that AffECt achieves EC classification accuracy of up to 83.3% and 80.2% for arousal and valence, respectively, clearly surpassing the results reported in the literature and exhibiting robust performance across different languages. Moreover, there is a distinct improvement when the AD are combined with the TCNN, compared to baseline deep learning approaches. These results demonstrate the effectiveness of AffECt in speech-based EC recognition, paving the way for many applications, e.g., in patients’ group therapy, negotiations, and emotion-aware mobile applications.
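The abstract refers to deep features extracted from speech with a temporal convolutional network. The snippet below is not taken from the paper; it is a minimal illustrative sketch of a dilated, causal temporal convolutional block of the kind commonly used for speech feature extraction, assuming a log-mel spectrogram input. All layer sizes, channel counts, and names (e.g., `TCNBlock`, `SpeechTCN`) are hypothetical assumptions for illustration only.

```python
import torch
import torch.nn as nn


class TCNBlock(nn.Module):
    """One dilated causal 1-D convolutional block with a residual connection.

    Causality is enforced by left-padding the input and trimming the same
    number of frames from the right of the convolution output.
    """

    def __init__(self, in_ch: int, out_ch: int, kernel_size: int = 3, dilation: int = 1):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation  # left padding for causal conv
        self.conv = nn.Conv1d(in_ch, out_ch, kernel_size, dilation=dilation)
        self.relu = nn.ReLU()
        self.norm = nn.BatchNorm1d(out_ch)
        # 1x1 convolution to match channels for the residual path if needed
        self.residual = nn.Conv1d(in_ch, out_ch, 1) if in_ch != out_ch else nn.Identity()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, time)
        y = nn.functional.pad(x, (self.pad, 0))   # pad only on the left (past)
        y = self.norm(self.relu(self.conv(y)))
        return y + self.residual(x)


class SpeechTCN(nn.Module):
    """Stack of TCN blocks with increasing dilation, followed by temporal pooling.

    Produces one fixed-size embedding per utterance; a downstream classifier
    could map this embedding to arousal/valence levels.
    """

    def __init__(self, n_mels: int = 64, hidden: int = 128, n_classes: int = 2):
        super().__init__()
        self.blocks = nn.Sequential(
            TCNBlock(n_mels, hidden, dilation=1),
            TCNBlock(hidden, hidden, dilation=2),
            TCNBlock(hidden, hidden, dilation=4),
        )
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, mel: torch.Tensor) -> torch.Tensor:
        # mel: (batch, n_mels, time) log-mel spectrogram frames
        h = self.blocks(mel)
        h = h.mean(dim=-1)          # average-pool over time
        return self.head(h)         # logits for low/high arousal or valence


if __name__ == "__main__":
    # Toy usage: a batch of 4 utterances, 64 mel bands, 300 frames each.
    model = SpeechTCN()
    logits = model(torch.randn(4, 64, 300))
    print(logits.shape)  # torch.Size([4, 2])
```

In the paper, such deep speech features are combined with affect-dynamics descriptors before classification; how that fusion is performed is described in the full text and is not reproduced here.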

Keywords