Complex & Intelligent Systems (Dec 2023)

Visual sentiment analysis with semantic correlation enhancement

  • Hao Zhang,
  • Yanan Liu,
  • Zhaoyu Xiong,
  • Zhichao Wu,
  • Dan Xu

DOI
https://doi.org/10.1007/s40747-023-01296-w
Journal volume & issue
Vol. 10, no. 2
pp. 2869 – 2881

Abstract

Visual sentiment analysis is in great demand because it provides a computational method for recognizing sentiment information in the abundant visual content on social media sites. Most existing methods use CNNs to extract varying visual attributes for image sentiment prediction, but they fail to comprehensively consider the correlation among visual components and are consequently limited by the receptive field of convolutional layers. In this work, we propose VSCNet, a visual semantic correlation network and a Transformer-based visual sentiment prediction model. Specifically, global visual features are captured by an extended attention network built by stacking a well-designed, Transformer-like extended attention mechanism. An off-the-shelf object query tool determines the local candidates for potential affective regions, filtering out redundant and noisy visual proposals. All candidates considered affective are embedded into a computable semantic space. Finally, a fusion strategy integrates the semantic representations and visual features for sentiment analysis. Extensive experiments show that our method outperforms previous studies on five annotated public image sentiment datasets without any training tricks. More specifically, it achieves 1.8% higher accuracy on the FI benchmark than other state-of-the-art methods.
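
The pipeline outlined in the abstract (filter object proposals, embed the kept candidates, fuse with global visual features, classify) can be illustrated with a minimal sketch. This is not the VSCNet implementation: the abstract does not specify the fusion strategy, so the concatenation below, the confidence threshold, the mean-pooled label embeddings, the linear classifier, and every function name and toy value are assumptions chosen only to make the data flow concrete.

```python
import math

def filter_proposals(proposals, min_score=0.5):
    # Hypothetical filter: keep detector proposals above a confidence threshold.
    return [p for p in proposals if p["score"] >= min_score]

def embed_labels(labels, table, dim):
    # Assumed semantic embedding: mean of per-label vectors, zeros if none kept.
    vecs = [table[name] for name in labels if name in table]
    if not vecs:
        return [0.0] * dim
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict(global_feat, proposals, table, weights, bias, min_score=0.5):
    # Fuse global visual features with the pooled semantic vector.
    kept = filter_proposals(proposals, min_score)
    dim = len(next(iter(table.values())))
    semantic = embed_labels([p["label"] for p in kept], table, dim)
    fused = global_feat + semantic  # assumption: fusion by concatenation
    logits = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(logits)

# Toy inputs (all values invented for illustration).
global_feat = [0.2, -0.1]
proposals = [
    {"label": "dog", "score": 0.9},
    {"label": "wall", "score": 0.2},  # dropped as a low-confidence proposal
]
table = {"dog": [1.0, 0.0], "wall": [0.0, 1.0]}
weights = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]
bias = [0.0, 0.0]

probs = predict(global_feat, proposals, table, weights, bias)
print(probs)  # two class probabilities that sum to 1
```

In this toy run only the "dog" proposal passes the threshold, so the semantic vector is its embedding and the first class receives the higher probability. A real model would replace the lookup table and linear layer with learned components.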

Keywords