Vietnam Journal of Computer Science (May 2023)

Constructing Japanese Bullying Expression Dictionary for Automated Cyberbullying Detection on Twitter

  • Jianwei Zhang
  • Lin Li
  • Shinsuke Nakajima

DOI
https://doi.org/10.1142/S2196888822500373
Journal volume & issue
Vol. 10, no. 02
pp. 135 – 158

Abstract

Cyberbullying has become a serious problem with the spread of personal computers, smartphones, and social networking services (SNS). In this paper, to support automated cyberbullying detection on Twitter, we construct a Japanese bullying expression dictionary that registers bullying words together with their degrees of relatedness to bullying. The registered words are those that appear in collected bullying-related tweets, and the bullying degree attached to each word is calculated using Semantic Orientation Using Pointwise Mutual Information (SO-PMI). We also construct models that automatically classify tweets as bullying or non-bullying by extracting multiple features, including those drawn from the bullying expression dictionary, and combining them with multiple machine learning algorithms. We then evaluate the classification performance of the constructed models. The experimental results show that the bullying expression dictionary contributes to cyberbullying detection for most of the machine learning algorithms and that the best model achieves an F-measure exceeding 0.9. We further investigate whether the period over which a bullying expression dictionary is constructed affects classification performance. The results indicate that the number of registered words has a more immediate impact on classification performance than the period of dictionary construction.
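
The abstract names SO-PMI as the scoring method but does not give the computation. As a rough illustration only, the sketch below scores each word by how much more often it co-occurs with a set of bullying seed words than with a set of non-bullying seed words, counting co-occurrence at the tweet level. The function name `so_pmi_scores`, the seed sets, the tweet-level co-occurrence window, the additive `smoothing` constant, and the toy English tokens are all assumptions made for this example; they are not taken from the paper, which works on Japanese tweets and may define its counts differently.

```python
import math
from collections import Counter


def so_pmi_scores(tweets, bully_seeds, neutral_seeds, smoothing=0.01):
    """Score words with a simplified SO-PMI over tokenized tweets.

    Assumed form (not confirmed by the source):
        SO-PMI(w) = log2( hits(w, B) * hits(N) / (hits(w, N) * hits(B)) )
    where hits(w, B) counts tweets containing w and at least one bullying
    seed, and hits(B) counts tweets containing at least one bullying seed
    (likewise for the non-bullying seeds N). `smoothing` avoids log(0).
    """
    bully_seeds, neutral_seeds = set(bully_seeds), set(neutral_seeds)
    hits_bully = hits_neutral = 0
    with_bully, with_neutral = Counter(), Counter()

    for tokens in tweets:
        words = set(tokens)  # count each word once per tweet
        has_bully = bool(words & bully_seeds)
        has_neutral = bool(words & neutral_seeds)
        hits_bully += has_bully
        hits_neutral += has_neutral
        # Seed words themselves are not scored.
        for w in words - bully_seeds - neutral_seeds:
            if has_bully:
                with_bully[w] += 1
            if has_neutral:
                with_neutral[w] += 1

    scores = {}
    for w in set(with_bully) | set(with_neutral):
        num = (with_bully[w] + smoothing) * (hits_neutral + smoothing)
        den = (with_neutral[w] + smoothing) * (hits_bully + smoothing)
        scores[w] = math.log2(num / den)
    return scores


# Toy, hypothetical data: each tweet is already split into tokens.
toy_tweets = [
    ["idiot", "loser"],
    ["idiot", "loser", "school"],
    ["thanks", "friend"],
    ["thanks", "friend", "school"],
]
toy_scores = so_pmi_scores(toy_tweets, {"idiot"}, {"thanks"})
```

Under these assumptions, a word seen only alongside bullying seeds ("loser") gets a positive score, a word seen only alongside non-bullying seeds ("friend") gets a negative score, and a word seen equally with both ("school") scores zero. A dictionary in the spirit of the abstract would then keep the words whose score exceeds some chosen threshold.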

Keywords