IEEE Access (Jan 2024)
Multimodal Sentiment Analysis of Government Information Comments Based on Contrastive Learning and Cross-Attention Fusion Networks
Abstract
Accurate identification of sentiment in government-related comments is crucial for policymakers to understand public opinion in depth, adjust policies promptly, and enhance overall satisfaction. We therefore propose a multimodal sentiment analysis model for government information comments based on contrastive learning and a cross-attention fusion network. First, we collect text-image comments from the Politics and Law section of the Today’s Headlines app and extract textual and visual features. We fine-tune the model with LoRA, optimizing the feature representation through low-rank adjustments to the fused features. Second, we employ contrastive learning with reverse prediction to analyze intra-class and inter-class cross-modal dynamics. We then design a fusion network that uses cross-attention to learn the complementary relationships between modalities. Finally, the features are combined through a fully connected layer. Experiments show that the model achieves 96.80% accuracy in recognizing sentiment polarity, an improvement of 10.21% over the multimodal model CLIP. The model can assist governments in analyzing sentiment evolution, detecting public opinion, and guiding online public opinion.
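For illustration, the sketch below shows how a bidirectional cross-attention fusion head of the kind the abstract describes might look in PyTorch: each modality attends to the other, and the attended features are pooled, concatenated, and classified by a fully connected layer. The feature dimension, head count, mean pooling, and three-way polarity output are assumptions made for this example, not the authors' actual implementation.

```python
import torch
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    """Illustrative cross-attention fusion head (assumed design, not the
    paper's code): text attends over image features and vice versa, then a
    fully connected layer maps the fused representation to polarity logits."""

    def __init__(self, dim=768, num_heads=8, num_classes=3):
        super().__init__()
        # Text queries attend over image keys/values, and the reverse.
        self.text_to_image = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.image_to_text = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.classifier = nn.Linear(2 * dim, num_classes)

    def forward(self, text_feats, image_feats):
        # text_feats: (batch, text_len, dim); image_feats: (batch, patches, dim)
        t2i, _ = self.text_to_image(text_feats, image_feats, image_feats)
        i2t, _ = self.image_to_text(image_feats, text_feats, text_feats)
        # Mean-pool each attended sequence, concatenate, and classify.
        fused = torch.cat([t2i.mean(dim=1), i2t.mean(dim=1)], dim=-1)
        return self.classifier(fused)

# Random tensors stand in for the extracted textual and visual features.
model = CrossAttentionFusion()
logits = model(torch.randn(4, 32, 768), torch.randn(4, 49, 768))
print(logits.shape)  # torch.Size([4, 3])
```

The two attention directions capture the complementary text-to-image and image-to-text relationships the abstract refers to; the concatenation-plus-linear step corresponds to the final fully connected combination.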
Keywords