IEEE Access (Jan 2019)

Dual Exclusive Attentive Transfer for Unsupervised Deep Convolutional Domain Adaptation in Speech Emotion Recognition

  • Elias Nii Noi Ocquaye,
  • Qirong Mao,
  • Heping Song,
  • Guopeng Xu,
  • Yanfei Xue

DOI
https://doi.org/10.1109/ACCESS.2019.2924597
Journal volume & issue
Vol. 7
pp. 93847–93857

Abstract


Given the many speech emotion corpora available both publicly and privately, and the numerous factors that distinguish them, the assumption that training and testing features are drawn from the same distribution and parameterize the same feature space does not hold in most real-world scenarios. To address this challenge through domain adaptation, we propose a dual exclusive attentive transfer (DEAT) deep convolutional neural network architecture for the unsupervised domain adaptation setting. The proposed architecture applies an unshared attentive transfer procedure for convolutional adaptation of both the source and target domains. A correlation alignment loss (CALLoss) minimizes the domain shift by aligning the second-order statistics of the convolutional layers' attention maps across the two domains. Then, so that the proposed network can effectively model the shift between dissimilar domains, we make the weights of the corresponding layers exclusive but related. The proposed model jointly minimizes the classification loss on the labeled source domain and the correlation alignment losses of both the convolutional and fully-connected layers. We evaluate our architecture using the Interspeech 2009 Emotion Challenge FAU Aibo Emotion Corpus as the target dataset and two publicly available corpora (ABC and Emo-DB) as source datasets. Our experimental results show that our domain adaptation method is superior to other state-of-the-art methods.
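The correlation alignment loss described in the abstract aligns second-order statistics (covariances) of source and target features. As a rough illustration only, the following sketch computes a generic CORAL-style loss between two feature batches; the paper applies this idea to the attention maps of convolutional layers, whereas this toy version operates on plain feature matrices, and the function name and scaling convention here are assumptions, not the authors' exact formulation.

```python
import numpy as np

def coral_loss(source, target):
    """CORAL-style correlation alignment loss: squared Frobenius
    distance between the feature covariance matrices of the source
    and target batches, scaled by 1/(4 d^2). This is a generic
    sketch, not the paper's exact CALLoss on attention maps."""
    d = source.shape[1]

    def cov(x):
        # Sample covariance of a (batch, features) matrix.
        xm = x - x.mean(axis=0, keepdims=True)
        return xm.T @ xm / (x.shape[0] - 1)

    diff = cov(source) - cov(target)
    return float(np.sum(diff ** 2)) / (4 * d ** 2)

rng = np.random.default_rng(0)
src = rng.normal(size=(64, 8))                       # source-domain features
tgt = rng.normal(loc=0.5, scale=2.0, size=(64, 8))   # shifted target features

print(coral_loss(src, src))  # identical batches give zero loss
print(coral_loss(src, tgt) > 0.0)
```

In a training loop, a loss of this form would be added to the source-domain classification loss, pushing the network toward domain-invariant representations.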

Keywords