IEEE Access (Jan 2024)

Multimodal Sentiment Analysis of Government Information Comments Based on Contrastive Learning and Cross-Attention Fusion Networks

  • Guangyu Mu,
  • Chuanzhi Chen,
  • Xiurong Li,
  • Jiaxue Li,
  • Xiaoqing Ju,
  • Jiaxiu Dai

DOI
https://doi.org/10.1109/ACCESS.2024.3493933
Journal volume & issue
Vol. 12
pp. 165525 – 165538

Abstract

Accurately identifying sentiment in comments on government information helps policymakers understand public opinion, adjust policies promptly, and improve overall satisfaction. We therefore propose a multimodal sentiment analysis model for government information comments based on contrastive learning and cross-attention fusion networks. First, we collect text-image comments from the Politics and Law section of the Today’s Headlines app and extract textual and visual features. We fine-tune the model with LoRA, optimizing the feature representation through low-rank adjustments to the fused features. Second, we use contrastive learning with reverse prediction to analyze intra-class and inter-class cross-modal dynamics. Third, we propose a novel fusion network that uses cross-attention to learn the complementary relationships between modalities. Finally, a fully connected layer combines the features. Experiments show that the model achieves 96.80% accuracy in recognizing emotion polarity, an improvement of 10.21% over the multimodal model CLIP. The model could assist the government with analyzing how emotions evolve, detecting public opinion, and guiding online public opinion.
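
The cross-attention fusion step described in the abstract can be illustrated with a minimal sketch. This is not the authors' network: it assumes plain scaled dot-product attention with no learned projections, and the names `cross_attention`, `fuse`, and `mean_pool` are hypothetical. Text features attend over image features and vice versa, and the two pooled results are concatenated, which is where a fully connected layer would take over.

```python
import math

def softmax(row):
    # Numerically stable softmax over a list of scores.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    # Scaled dot-product attention: every query vector attends over all keys
    # and returns a weighted average of the corresponding values.
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

def mean_pool(rows):
    # Average a sequence of vectors into one vector.
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]

def fuse(text_feats, image_feats):
    # Assumption: both modalities share the same feature dimension and no
    # learned query/key/value projections are applied.
    text_to_image = cross_attention(text_feats, image_feats, image_feats)
    image_to_text = cross_attention(image_feats, text_feats, text_feats)
    # Concatenate the pooled views; a classifier head would consume this vector.
    return mean_pool(text_to_image) + mean_pool(image_to_text)

# Toy features: 2 text tokens and 3 image regions, each 2-dimensional.
fused = fuse([[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0], [0.0, 2.0], [2.0, 0.0]])
print(len(fused))
```

In a trained model the queries, keys, and values would each pass through learned linear projections, and the fused vector would feed the classification layer; the sketch only shows how each modality reads from the other.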

Keywords