IEEE Access (Jan 2024)

MLHS-CGCapNet: A Lightweight Model for Multilingual Hate Speech Detection

  • Abida Kousar
  • Jameel Ahmad
  • Khalid Ijaz
  • Amr Yousef
  • Zaffar Ahmed Shaikh
  • Ikramullah Khosa
  • Durga Chavali
  • Mohd Anjum

DOI
https://doi.org/10.1109/ACCESS.2024.3434664
Journal volume & issue
Vol. 12
pp. 106631 – 106644

Abstract

The rapid advancement of computer technology and the widespread adoption of online social media platforms have inadvertently provided fertile ground for individuals with antisocial inclinations to thrive, ushering in a range of security concerns, including the proliferation of fake profiles, hate speech, social bots, and the spread of unfounded rumors. Among these issues, a prominent concern is the prevalence of hate speech within online social networks (OSNs). However, the relevance of numerous studies on hate speech detection has been limited, as they primarily focus on a single language, often English. In response, our research embarks on an exhaustive exploration of multilingual hate speech across 12 distinct languages, offering a novel approach by adapting hate speech detection resources across linguistic boundaries. This study presents the development of a robust, lightweight and multilingual hate speech detection model, known as MLHS-CGCapNet, which combines convolutional and bidirectional gated recurrent units with a capsule network. With commendable accuracy, recall and f-score values of 0.89, 0.80, and 0.84, respectively, our proposed model exhibits strong performance, even when handling an imbalanced dataset. Notably, during the training and validation phases, the suggested model showcases exceptional effectiveness, achieving accuracy values of 0.93 and 0.90, respectively, particularly in the challenging context of imbalanced data. In comparison to both baseline and state-of-the-art techniques, our model offers superior performance.

Keywords