IEEE Open Journal of Signal Processing (Jan 2024)
Zero-Shot Visual Sentiment Prediction via Cross-Domain Knowledge Distillation
Abstract
There are various sentiment theories for categorizing human sentiments into discrete categories, so the theory used to train a sentiment prediction method does not always match the one used at test time. As a solution to this problem, zero-shot visual sentiment prediction methods have been proposed to predict unseen sentiments for which no images are available during training. However, previous zero-shot methods are trained under a single sentiment theory, which limits their ability to handle sentiments defined in other theories. This article therefore proposes a more robust zero-shot visual sentiment prediction method that can handle cross-domain sentiments defined in different sentiment theories. Specifically, focusing on the fact that sentiments are abstract concepts shared by humans regardless of the theory used to define them, we incorporate knowledge distillation into our method and construct a teacher–student model that can learn the implicit relationships between sentiments defined in different sentiment theories. Furthermore, to enhance sentiment discrimination capability and strengthen these implicit relationships, we introduce a novel sentiment loss between the teacher and student models. In this way, our model becomes robust to unseen sentiments by exploiting the implicit relationships between sentiments. The contributions of this article are the introduction of knowledge distillation with a novel sentiment loss between the teacher and student models for zero-shot visual sentiment prediction, and improved zero-shot visual sentiment prediction performance. Experiments on several open datasets demonstrate the effectiveness of the proposed method.
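The abstract does not specify the exact form of the proposed sentiment loss, so the following is only a minimal sketch of a standard temperature-scaled knowledge-distillation objective between teacher and student logits, under the assumption of a PyTorch implementation; the function name `distillation_loss` and the `temperature` parameter are illustrative, not the authors' actual formulation.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      temperature: float = 4.0) -> torch.Tensor:
    """Generic soft-target distillation term (illustrative only):
    KL divergence between temperature-softened teacher and student
    sentiment distributions."""
    t = temperature
    soft_teacher = F.softmax(teacher_logits / t, dim=-1)
    log_soft_student = F.log_softmax(student_logits / t, dim=-1)
    # Scale by t**2 so gradient magnitudes stay comparable across temperatures.
    return F.kl_div(log_soft_student, soft_teacher,
                    reduction="batchmean") * (t ** 2)
```

In practice such a term would be combined with a supervised classification loss on the seen sentiment categories; the paper's own sentiment loss between teacher and student may differ from this generic form.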
Keywords