Jisuanji kexue (Sep 2022)
Dual Variational Multi-modal Attention Network for Incomplete Social Event Classification
Abstract
The rapid development of the Internet and the continuous expansion of social media have produced a wealth of social event information, making social event classification increasingly challenging. Making full use of both image-level and text-level information is the key to this task. However, most existing methods have two limitations: 1) most multi-modal methods assume that the samples of every modality are sufficient and complete, but in real applications this assumption does not always hold, and a modality of an event may be missing; 2) most methods simply concatenate the image features and text features of social events to obtain the multi-modal features used for classification. To address these limitations, this paper proposes a dual variational multi-modal attention network (DVMAN) for social event classification. Within DVMAN, a novel dual variational autoencoder network generates shared representations of social events and reconstructs the missing modality information in incomplete social event learning. Through distribution alignment and cross-reconstruction alignment, the image and text latent representations are doubly aligned to narrow the gap between modalities, and a generative model synthesizes the latent representation of any missing modality. In addition, a multi-modal fusion module integrates the fine-grained information of the images and texts of social events, so that the modalities complement and enhance each other. Extensive experiments on two publicly available event datasets show that DVMAN improves accuracy by more than 4% over existing advanced methods, demonstrating its superior performance for social event classification.
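The two alignment terms named in the abstract can be illustrated with a minimal sketch. The abstract does not specify the distance measures, so the choices here are assumptions: distribution alignment is taken to be the closed-form 2-Wasserstein distance between diagonal Gaussian posteriors, and cross-reconstruction alignment is taken to be an L1 error after decoding each modality from the other modality's latent code. All function and parameter names are hypothetical and are not taken from DVMAN itself.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def distribution_alignment(mu_img: Vector, sigma_img: Vector,
                           mu_txt: Vector, sigma_txt: Vector) -> float:
    """2-Wasserstein distance between two diagonal Gaussians.

    Assumed choice of distance; the abstract only says the image and
    text latent distributions are aligned.
    """
    mean_term = sum((a - b) ** 2 for a, b in zip(mu_img, mu_txt))
    std_term = sum((a - b) ** 2 for a, b in zip(sigma_img, sigma_txt))
    return math.sqrt(mean_term + std_term)


def l1_error(x: Vector, y: Vector) -> float:
    """Sum of absolute differences between two feature vectors."""
    return sum(abs(a - b) for a, b in zip(x, y))


def cross_reconstruction(x_img: Vector, x_txt: Vector,
                         z_img: Vector, z_txt: Vector,
                         decode_img: Callable[[Vector], Vector],
                         decode_txt: Callable[[Vector], Vector]) -> float:
    """Reconstruct each modality from the OTHER modality's latent code.

    decode_img and decode_txt stand in for the two VAE decoders
    (hypothetical placeholders, not the paper's networks).
    """
    img_from_txt = l1_error(x_img, decode_img(z_txt))
    txt_from_img = l1_error(x_txt, decode_txt(z_img))
    return img_from_txt + txt_from_img
```

In a sketch like this, both terms would be added to the usual per-modality VAE objectives, so that a latent code inferred from the available modality can stand in when the other modality is missing.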
Keywords