Informatics (Oct 2021)

Arabic Offensive and Hate Speech Detection Using a Cross-Corpora Multi-Task Learning Model

  • Wassen Aldjanabi,
  • Abdelghani Dahou,
  • Mohammed A. A. Al-qaness,
  • Mohamed Abd Elaziz,
  • Ahmed Mohamed Helmi,
  • Robertas Damaševičius

DOI
https://doi.org/10.3390/informatics8040069
Journal volume & issue
Vol. 8, no. 4
p. 69

Abstract

As social media platforms offer a medium for expressing opinions, social phenomena such as hatred, offensive language, racism, and other forms of verbal violence have increased dramatically. These behaviors are not confined to specific countries, groups, or communities; they extend into people’s everyday lives. This study investigates offensive and hate speech on Arab social media with the aim of building an accurate detection system. More precisely, we develop a classification system for detecting offensive and hate speech using a multi-task learning (MTL) model built on top of a pre-trained Arabic language model. We train the MTL model on the same task across multiple corpora that vary in their offensive and hate context, so that it learns both global and dataset-specific contextual representations. The developed MTL model performed strongly, outperforming existing models in the literature on three out of four datasets for Arabic offensive and hate speech detection.
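
The idea of a shared encoder with dataset-specific outputs, as described in the abstract, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy `SharedEncoder` merely stands in for the pre-trained Arabic language model, and the class names, corpus names (`corpus_a`, `corpus_b`), label counts, and dimensions are all hypothetical. The weights are random and untrained, so the sketch shows only how one global representation might feed one classification head per corpus.

```python
import math
import random


class SharedEncoder:
    """Toy stand-in for a pre-trained language model (hypothetical, untrained)."""

    def __init__(self, dim=8, buckets=64, seed=0):
        rng = random.Random(seed)
        self.dim = dim
        self.buckets = buckets
        # One random vector per bucket; tokens are mapped to buckets below.
        self.table = [[rng.uniform(-1, 1) for _ in range(dim)]
                      for _ in range(buckets)]

    def encode(self, text):
        # Average the bucket vectors of whitespace-separated tokens.
        tokens = text.split()
        vec = [0.0] * self.dim
        for tok in tokens:
            row = self.table[sum(tok.encode("utf-8")) % self.buckets]
            for i in range(self.dim):
                vec[i] += row[i]
        n = max(len(tokens), 1)
        return [v / n for v in vec]


class TaskHead:
    """Dataset-specific linear layer followed by a softmax."""

    def __init__(self, dim, num_classes, seed):
        rng = random.Random(seed)
        self.weights = [[rng.uniform(-1, 1) for _ in range(dim)]
                        for _ in range(num_classes)]

    def predict_proba(self, features):
        logits = [sum(w * x for w, x in zip(row, features))
                  for row in self.weights]
        peak = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(z - peak) for z in logits]
        total = sum(exps)
        return [e / total for e in exps]


class MultiTaskClassifier:
    """One shared encoder (global) plus one head per corpus (dataset-specific)."""

    def __init__(self, classes_per_corpus, dim=8):
        self.encoder = SharedEncoder(dim=dim)
        self.heads = {
            name: TaskHead(dim, num_classes, seed=index + 1)
            for index, (name, num_classes) in enumerate(classes_per_corpus.items())
        }

    def predict_proba(self, corpus, text):
        features = self.encoder.encode(text)  # shared representation
        return self.heads[corpus].predict_proba(features)


# Hypothetical corpora with different label sets.
model = MultiTaskClassifier({"corpus_a": 2, "corpus_b": 3})
probs_a = model.predict_proba("corpus_a", "example post text")
probs_b = model.predict_proba("corpus_b", "example post text")
```

In a real system the encoder would be a pre-trained transformer fine-tuned jointly with the heads, with batches drawn from all corpora; the sketch omits training entirely.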

Keywords