IEEE Access (Jan 2024)
BCD-MM: Multimodal Sentiment Analysis Model With Dual-Bias-Aware Feature Learning and Attention Mechanisms
Abstract
Multimodal Sentiment Analysis (MSA) is attracting growing attention but faces two main challenges: extracting cross-modal features efficiently without redundancy, and removing spurious correlations between sentiment labels and multimodal features. In this paper, we propose a novel multimodal debiasing model, the Bilateral Cross-modal Debias Multimodal sentiment analysis Model (BCD-MM), to address these issues. BCD-MM improves generalization to out-of-distribution (OOD) data by strengthening low-redundancy cross-modal feature extraction and reducing reliance on non-causal correlations. First, BCD-MM uses an attention-score-based method to preserve critical information and eliminate redundancy within each modality, and a gated cross-modal attention mechanism to filter inconsistencies through modal interaction, thereby enhancing the extraction of modality-specific cross-modal features. Second, BCD-MM incorporates a debiasing approach with double bias extraction, combining a Tanh-based Mean Absolute Error (TMAE) loss function with inverse probability weighting to mitigate spurious correlations. Finally, extensive experiments on three public datasets (MOSI, MOSEI, and SIMS) and two OOD datasets (OOD MOSI and OOD MOSEI) demonstrate the model's effectiveness on both MSA and debiasing tasks.
Keywords