IEEE Access (Jan 2024)
A Unified Model for Style Classification and Emotional Response Analysis
Abstract
The emergence of Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) has markedly transformed image classification and analysis in computer vision. These advances have reshaped sectors such as medical diagnostics and autonomous driving, while also opening novel intersections with artistic exploration. Despite this progress, integrating art style classification with emotion prediction remains an open challenge: the complex interplay between an artwork's style and the emotional reactions it triggers calls for a method that models both jointly. Addressing this challenge, our study presents a Unified Model for Art Style and Emotion Prediction (ASE), which adopts a multi-task learning approach. The model is structured around three main elements: an artwork style classification head, a viewer emotion prediction head, and a task-specific attention module. By combining a pre-trained image encoder with task-specific attention, the framework processes both tasks concurrently while learning specialized feature representations for each. We validate the model on the ArtEmis dataset, demonstrating accurate art style classification and identification of emotional responses, and showing that it effectively captures the relationship between an artwork's style and the emotions it evokes.
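For concreteness, the following is a minimal PyTorch sketch of the multi-task structure the abstract describes: a shared pre-trained encoder, per-task attention over spatial features, and separate heads for style classification and emotion prediction. The ResNet-50 backbone, the attention design, and the class counts are illustrative assumptions, not the paper's exact architecture.

# Minimal sketch of the ASE multi-task structure (assumptions, not the
# authors' exact model): shared pre-trained encoder, task-specific
# attention, and two prediction heads trained jointly.
import torch
import torch.nn as nn
from torchvision.models import resnet50, ResNet50_Weights


class TaskAttention(nn.Module):
    """Per-task spatial attention: scores each location, then pools."""

    def __init__(self, channels: int):
        super().__init__()
        self.score = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (B, C, H, W) -> attention-weighted pooled vector (B, C)
        b, c, h, w = feats.shape
        weights = torch.softmax(self.score(feats).view(b, 1, h * w), dim=-1)
        return (feats.view(b, c, h * w) * weights).sum(dim=-1)


class ASEModel(nn.Module):
    """Unified style + emotion model (hypothetical class counts)."""

    def __init__(self, num_styles: int = 27, num_emotions: int = 9):
        super().__init__()
        backbone = resnet50(weights=ResNet50_Weights.DEFAULT)
        # Keep only the convolutional stages; drop avgpool and fc.
        self.encoder = nn.Sequential(*list(backbone.children())[:-2])
        self.style_attn = TaskAttention(2048)
        self.emotion_attn = TaskAttention(2048)
        self.style_head = nn.Linear(2048, num_styles)
        self.emotion_head = nn.Linear(2048, num_emotions)

    def forward(self, images: torch.Tensor):
        feats = self.encoder(images)  # shared features (B, 2048, H, W)
        style_logits = self.style_head(self.style_attn(feats))
        emotion_logits = self.emotion_head(self.emotion_attn(feats))
        return style_logits, emotion_logits


# Joint training would combine both task losses, e.g. a weighted sum of
# two cross-entropy terms.
model = ASEModel()
style_logits, emotion_logits = model(torch.randn(2, 3, 224, 224))

Because the encoder is shared while each task attends to its own feature subset, the two tasks can reinforce one another without forcing a single pooled representation to serve both.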
Keywords