IEEE Access (Jan 2019)
Dual Exclusive Attentive Transfer for Unsupervised Deep Convolutional Domain Adaptation in Speech Emotion Recognition
Abstract
Speech emotion corpora, whether public or private, differ from one another in numerous ways, so the assumption that training and testing samples are drawn from the same distribution and share the same feature-space parameterization does not hold in most real-world scenarios. To address this challenge through domain adaptation, we propose dual exclusive attentive transfer (DEAT), a deep convolutional neural network architecture for the unsupervised domain adaptation setting. The proposed architecture adopts an unshared attentive transfer procedure for convolutional adaptation of both the source and target domains. A correlation alignment loss (CALLoss) is applied to minimize the domain shift by aligning the second-order statistics of the convolutional layer's attention maps in both domains. Then, for the proposed network to effectively model the shift between dissimilar domains, we make the weights of the corresponding layers exclusive but related. The proposed model jointly minimizes the classification loss on the labeled source domain and the correlation alignment loss of both the convolutional and fully-connected layers. We evaluate our architecture using the Interspeech 2009 Emotion Challenge FAU Aibo Emotion Corpus as the target dataset and two publicly available corpora (ABC and Emo-DB) as source datasets. Our experimental results show that our domain adaptation method outperforms other state-of-the-art methods.
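As a rough illustration of the alignment of second-order statistics described in the abstract, the following NumPy sketch computes a correlation alignment loss between source and target attention maps. It is not the authors' implementation. The attention-map definition (channel-wise sum of squared activations, flattened and L2-normalized), the 1/(4d^2) scaling borrowed from the usual Deep CORAL formulation, the trade-off weight `lam`, and all function names are assumptions made for this example.

```python
import numpy as np


def attention_map(activations):
    """Collapse (batch, channels, height, width) activations into flattened,
    L2-normalized spatial attention maps of shape (batch, height * width).
    Assumed definition: sum of squared activations over the channel axis."""
    amap = np.sum(activations ** 2, axis=1)
    flat = amap.reshape(amap.shape[0], -1)
    norms = np.linalg.norm(flat, axis=1, keepdims=True)
    return flat / np.maximum(norms, 1e-12)


def covariance(features):
    """Unbiased covariance of a (batch, d) feature matrix."""
    n = features.shape[0]
    centered = features - features.mean(axis=0, keepdims=True)
    return centered.T @ centered / (n - 1)


def coral_loss(source, target):
    """Squared Frobenius distance between the source and target covariance
    matrices, scaled by 1 / (4 * d^2)."""
    d = source.shape[1]
    diff = covariance(source) - covariance(target)
    return float(np.sum(diff ** 2) / (4.0 * d * d))


def total_loss(classification_loss, alignment_losses, lam=1.0):
    """Source classification loss plus the weighted sum of per-layer
    alignment losses. The weight lam is a hypothetical hyperparameter."""
    return classification_loss + lam * sum(alignment_losses)
```

In a setting like the one the abstract outlines, `coral_loss` would be evaluated once per aligned layer (on attention maps for convolutional layers, on raw activations for fully-connected layers) and the results passed to `total_loss` together with the source-domain classification loss.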
Keywords