Art design integrating visual relation and affective semantics based on Convolutional Block Attention Mechanism-generative adversarial network model

Jiadong Shen; Jian Wang

doi:10.7717/peerj-cs.2274

PeerJ Computer Science (Aug 2024)

Affiliations

Jiadong Shen: School of Design and Art, Changsha University of Science and Technology, Changsha, Hunan, China
Jian Wang: School of Design and Art, Changsha University of Science and Technology, Changsha, Hunan, China

DOI: https://doi.org/10.7717/peerj-cs.2274
Journal volume & issue: Vol. 10, p. e2274

Abstract

Scene-based image semantic extraction and its precise sentiment expression significantly enhance artistic design. To address the incongruity between image features and sentiment features caused by non-bilinear pooling, this study introduces a generative adversarial network (GAN) model that integrates visual relationships with sentiment semantics. The GAN-based regularizer is utilized during training to incorporate target information derived from the contextual information into the process. This regularization mechanism imposes stronger penalties for inaccuracies in subject-object type predictions and integrates a sentiment corpus to generate more human-like descriptive statements. The capsule network is employed to reconstruct sentences and predict probabilities in the discriminator. To preserve crucial focal points in feature extraction, the Convolutional Block Attention Mechanism (CBAM) is introduced. Furthermore, two bidirectional long short-term memory (LSTM) modules are used to model both target and relational contexts, thereby refining target labels and inter-target relationships. Experimental results highlight the model’s superiority over comparative models in terms of accuracy, BiLingual Evaluation Understudy (BLEU) score, and text preservation rate. The proposed model achieves an accuracy of 95.40% and the highest BLEU score of 16.79, effectively capturing both the label content and the emotional nuances within the image.
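For readers unfamiliar with the BLEU metric reported above, the following is a minimal sketch of a sentence-level BLEU computation on a 0 to 100 scale. It is an illustration only, not the authors' evaluation code: the abstract does not state the tokenization, n-gram order, smoothing, or number of references used, so whitespace tokenization, uniform weights over 1- to 4-grams, a single reference, and no smoothing are all assumptions here. The function names `ngrams` and `bleu` are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU against one reference, scaled to 0-100.

    Assumptions (not taken from the paper): whitespace tokenization,
    uniform weights over 1..max_n grams, no smoothing.
    """
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # without smoothing, one zero precision gives BLEU 0
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty discourages candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    # Geometric mean of the n-gram precisions, times the brevity penalty.
    return 100.0 * bp * math.exp(sum(log_precisions) / max_n)
```

Under these assumptions, a generated description identical to its reference scores 100, one sharing no words with it scores 0, and a description that is correct but shorter than the reference is reduced by the brevity penalty. Corpus-level scores such as the one reported in the abstract are typically aggregated over many sentence pairs, which this sketch does not do.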

Published in PeerJ Computer Science

ISSN: 2376-5992 (Online)
Publisher: PeerJ Inc.
Country of publisher: United States
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://peerj.com/computer-science/
