Sensors (Mar 2023)

Multimodal Sentiment Analysis Representations Learning via Contrastive Learning with Condense Attention Fusion

  • Huiru Wang,
  • Xiuhong Li,
  • Zeyu Ren,
  • Min Wang,
  • Chunming Ma

DOI
https://doi.org/10.3390/s23052679
Journal volume & issue
Vol. 23, no. 5
p. 2679

Abstract

Multimodal sentiment analysis has gained popularity as a research field for its ability to predict users’ emotional tendencies more comprehensively. The data fusion module is a critical component of multimodal sentiment analysis, as it integrates information from multiple modalities. However, effectively combining modalities and removing redundant information remains challenging. In our research, we address these challenges by proposing a multimodal sentiment analysis model based on supervised contrastive learning, which yields more effective data representations and richer multimodal features. Specifically, we introduce the MLFC module, which combines a convolutional neural network (CNN) with a Transformer to reduce the redundancy of each modality’s features and filter out irrelevant information. Moreover, our model employs supervised contrastive learning to enhance its ability to learn standard sentiment features from data. We evaluate our model on three widely used datasets, namely MVSA-single, MVSA-multiple, and HFM, demonstrating that it outperforms state-of-the-art models. Finally, we conduct ablation experiments to validate the efficacy of our proposed method.
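
As background for the loss named above, the following is a minimal PyTorch sketch of the supervised contrastive (SupCon) loss of Khosla et al. (2020), in which samples sharing a sentiment label act as positives for one another. It illustrates the general technique only; the function name, batch shapes, and temperature value are illustrative assumptions, not the authors’ released code.

```python
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(features, labels, temperature=0.07):
    """Illustrative SupCon loss sketch (Khosla et al., 2020); not the paper's code.

    features: (N, D) batch of embeddings (e.g., fused text-image representations).
    labels:   (N,)   integer sentiment labels; same-label samples are positives.
    """
    features = F.normalize(features, dim=1)            # cosine similarity via dot product
    sim = features @ features.T / temperature          # (N, N) similarity logits
    n = features.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=features.device)
    sim = sim.masked_fill(self_mask, float("-inf"))    # exclude self-comparisons
    # Positives: other samples in the batch with the same sentiment label.
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    log_prob = log_prob.masked_fill(self_mask, 0.0)    # avoid -inf * 0 = NaN below
    # Mean log-likelihood over each anchor's positives, averaged over anchors
    # that have at least one positive in the batch.
    pos_counts = pos_mask.sum(dim=1)
    valid = pos_counts > 0
    loss = -(log_prob * pos_mask).sum(dim=1)[valid] / pos_counts[valid]
    return loss.mean()

# Hypothetical usage: a batch of 8 fused embeddings over 3 sentiment classes.
feats = torch.randn(8, 128)
labels = torch.tensor([0, 1, 0, 2, 1, 0, 2, 2])
print(supervised_contrastive_loss(feats, labels))
```

Pulling same-label samples together and pushing different-label samples apart in embedding space is what allows a model to learn sentiment features shared across samples, which is the role supervised contrastive learning plays in the abstract above.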

Keywords