Multi-class hate speech detection in the Norwegian language using FAST-RNN and multilingual fine-tuned transformers

Ehtesham Hashmi; Sule Yildirim Yayilgan

doi:10.1007/s40747-024-01392-5

Complex & Intelligent Systems (Mar 2024)

Affiliations

Ehtesham Hashmi: Department of Information Security and Communication Technology (IIK), Norwegian University of Science and Technology (NTNU)
Sule Yildirim Yayilgan: Department of Information Security and Communication Technology (IIK), Norwegian University of Science and Technology (NTNU)

DOI: https://doi.org/10.1007/s40747-024-01392-5
Journal volume & issue: Vol. 10, no. 3, pp. 4535–4556

Abstract

The growth of social networks has given individuals with prejudiced views a platform to spread hate speech and target others based on their gender, ethnicity, religion, or sexual orientation. While positive interactions within diverse communities can considerably enhance confidence, negative comments can damage people's reputations and well-being. This trend underscores the need for more diligent monitoring and robust policies on these platforms to protect individuals from discriminatory and harmful behavior. Hate speech is often characterized as an intentional act of aggression directed at a specific group, typically meant to harm or marginalize its members based on certain aspects of their identity. Most research on hate speech has been conducted in high-resource languages such as English, Spanish, and French. Low-resource languages, including European languages such as Irish, Norwegian, Portuguese, Polish, and Slovak as well as many South Asian languages, present challenges because limited linguistic resources make information extraction labor-intensive. In this study, we present deep neural networks with FastText word embeddings and regularization methods for multi-class hate speech detection in the Norwegian language, along with multilingual transformer-based models with hyperparameter tuning and generative configuration. FastText embeddings stacked with Bidirectional LSTM and GRU layers outperformed the other deep learning models, resulting in the FAST-RNN model. Finally, we compare our results with the state of the art and perform interpretability modeling using Local Interpretable Model-Agnostic Explanations (LIME) to gain a more comprehensive understanding of the model's decision-making mechanisms.

Published in Complex & Intelligent Systems

ISSN: 2199-4536 (Print); 2198-6053 (Online)
Publisher: Springer
Country of publisher: Switzerland
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science; Technology: Technology (General): Industrial engineering. Management engineering: Information technology
Website: https://www.springer.com/journal/40747
