Jisuanji kexue (Mar 2023)

Multimodal Sentiment Analysis Based on Adaptive Gated Information Fusion

  • CHEN Zhen, PU Yuanyuan, ZHAO Zhengpeng, XU Dan, QIAN Wenhua

DOI
https://doi.org/10.11896/jsjkx.220100156
Journal volume & issue
Vol. 50, no. 3
pp. 298–306

Abstract

The goal of multimodal sentiment analysis is to achieve reliable and robust sentiment analysis by exploiting the complementary information provided by multiple modalities. Recently, extracting deep semantic features with neural networks has achieved remarkable results in multimodal sentiment analysis, but the fusion of features at different levels across modalities is equally important in determining the effectiveness of sentiment analysis. Thus, a multimodal sentiment analysis model based on adaptive gated information fusion (AGIF) is proposed. First, the different levels of visual and color features extracted by a Swin Transformer and a ResNet are organically fused through a gated information fusion network according to their contribution to sentiment analysis. Second, because sentiment is abstract and complex, the sentiment of an image is often expressed by several subtle local regions; these sentiment-discriminative regions can be located accurately by iterative attention based on past information. The latest ERNIE pre-trained model is used to overcome the inability of Word2Vec and GloVe to handle polysemy. Finally, an auto-fusion network dynamically fuses the features of each modality, alleviating the information redundancy caused by deterministic operations (concatenation or TFN) when constructing the multimodal joint representation. Extensive experiments on three publicly available real-world datasets demonstrate the effectiveness of the proposed model.
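As a rough illustration of the gating idea described above, the PyTorch sketch below fuses two feature streams through a learned sigmoid gate that forms a per-dimension convex combination. The module name, the feature dimension, and the exact gate formulation are assumptions made for illustration only; the paper's AGIF network may differ in its details.

    import torch
    import torch.nn as nn

    class GatedFusion(nn.Module):
        """Illustrative sigmoid-gated fusion of two feature streams
        (e.g. Swin Transformer visual features and ResNet color features).
        This is a minimal sketch, not the paper's exact AGIF module."""

        def __init__(self, dim: int):
            super().__init__()
            # Gate maps the concatenated streams to per-dimension weights in (0, 1).
            self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())

        def forward(self, f_vis: torch.Tensor, f_col: torch.Tensor) -> torch.Tensor:
            g = self.gate(torch.cat([f_vis, f_col], dim=-1))
            # Convex combination: each dimension is weighed by its estimated contribution.
            return g * f_vis + (1.0 - g) * f_col

    if __name__ == "__main__":
        fusion = GatedFusion(dim=256)
        f_vis = torch.randn(8, 256)  # stand-in for Swin Transformer features
        f_col = torch.randn(8, 256)  # stand-in for ResNet color features
        print(fusion(f_vis, f_col).shape)  # torch.Size([8, 256])

A per-dimension gate lets the network weigh the two streams element-wise according to their usefulness for the sentiment decision, rather than committing to a fixed, deterministic concatenation.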

Keywords