Buzz Tweet Classification Based on Text and Image Features of Tweets Using Multi-Task Learning

Reishi Amitani; Kazuyuki Matsumoto; Minoru Yoshida; Kenji Kita

doi:10.3390/app112210567

Applied Sciences (Nov 2021)

Affiliations

Reishi Amitani: Graduate School of Sciences and Technology for Innovation, Tokushima University, Tokushima 770-8506, Japan
Kazuyuki Matsumoto: Graduate School of Sciences and Technology for Innovation, Tokushima University, Tokushima 770-8506, Japan
Minoru Yoshida: Graduate School of Sciences and Technology for Innovation, Tokushima University, Tokushima 770-8506, Japan
Kenji Kita: Graduate School of Sciences and Technology for Innovation, Tokushima University, Tokushima 770-8506, Japan

DOI: https://doi.org/10.3390/app112210567
Journal volume & issue: Vol. 11, no. 22, p. 10567

Abstract

This study investigates social media trends and proposes a buzz tweet classification method to explore the factors that cause the buzz phenomenon on Twitter. The causes of the buzz phenomenon are difficult to identify from tweet text alone. Restricting the analysis to tweets with attached images, and using both the characteristics of the images and the relationships between the text and the images, is expected to allow a more detailed analysis than is possible with text-only tweets. Therefore, an analysis method was devised based on a multi-task neural network that takes features extracted from the image and the text as input and outputs the buzz class (buzz/non-buzz) together with the numbers of “likes” (favorites) and “retweets” (RTs). Predictions made using a single text or image feature were compared with predictions made using combinations of multiple features. The differences between the features of buzz and non-buzz tweets were analyzed based on the cosine similarity between the text and the image. The buzz class was identified with an accuracy of approximately 80% for all combinations of image and text features, with the combination of BERT and VGG16 achieving the highest accuracy.
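
To make the described setup more concrete, the following is a minimal sketch of a multi-task forward pass and a cosine similarity computation, written with the Python standard library only. It is not the authors' implementation. The class name `MultiTaskSketch`, the single shared hidden layer, the layer sizes, the random weights, and the toy four-dimensional feature vectors are all assumptions made for illustration; in the study the inputs would be features from models such as BERT and VGG16. The sketch also assumes the text and image vectors have equal length when computing cosine similarity, which the abstract does not specify.

```python
import math
import random


def dense(x, weights, bias):
    """Affine layer: one output value per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_layer(rng, n_out, n_in):
    """Small random weights and zero biases (untrained, illustration only)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class MultiTaskSketch:
    """Hypothetical model: shared layer over concatenated features, three task heads."""

    def __init__(self, text_dim, image_dim, hidden_dim=8, seed=0):
        rng = random.Random(seed)
        self.shared = init_layer(rng, hidden_dim, text_dim + image_dim)
        self.buzz_head = init_layer(rng, 1, hidden_dim)   # buzz / non-buzz
        self.likes_head = init_layer(rng, 1, hidden_dim)  # number of likes
        self.rts_head = init_layer(rng, 1, hidden_dim)    # number of retweets

    def forward(self, text_vec, image_vec):
        # Concatenate text and image features, then share one hidden representation.
        hidden = relu(dense(list(text_vec) + list(image_vec), *self.shared))
        return {
            "buzz_prob": sigmoid(dense(hidden, *self.buzz_head)[0]),
            "likes": dense(hidden, *self.likes_head)[0],
            "retweets": dense(hidden, *self.rts_head)[0],
        }


# Toy feature vectors standing in for real text and image features.
text_vec = [0.2, -0.1, 0.4, 0.0]
image_vec = [0.1, 0.3, -0.2, 0.5]

model = MultiTaskSketch(text_dim=4, image_dim=4)
out = model.forward(text_vec, image_vec)
similarity = cosine_similarity(text_vec, image_vec)
```

In a trained version of such a model, the three heads would typically be optimized jointly with a weighted sum of a classification loss and two regression losses; the choice of losses and weights is not stated in the abstract and is left open here.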

Published in Applied Sciences

ISSN: 2076-3417 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Engineering (General). Civil engineering (General); Science: Biology (General); Science: Physics; Science: Chemistry
Website: http://www.mdpi.com/journal/applsci
