IEEE Access (Jan 2022)

Context-Aware Deep Learning Model for Detection of Roman Urdu Hate Speech on Social Media Platform

  • Muhammad Bilal,
  • Atif Khan,
  • Salman Jan,
  • Shahrulniza Musa

DOI
https://doi.org/10.1109/ACCESS.2022.3216375
Journal volume & issue
Vol. 10
pp. 121133 – 121151

Abstract

Over the last two decades, social media platforms have grown dramatically. Twitter and Facebook are the two most popular social media platforms, with millions of active users posting billions of messages daily. These platforms give users freedom of expression. However, some users exploit this freedom by disseminating hate speech. Manual detection and censorship of hate speech is impractical; thus, an automatic mechanism is required to detect and counter hate speech in real time. Most research on hate speech detection has been carried out for the English language, whereas minimal work has been done for other languages, particularly Urdu written in Roman script (Roman Urdu). A few studies have applied machine learning and deep learning models to Roman Urdu hate speech detection; however, due to the scarcity of Roman Urdu resources and of large corpora with defined annotation rules, a robust hate speech detection model is still required. With this motivation, this study makes the following contributions. First, we developed annotation guidelines for Roman Urdu hate speech. Second, we constructed a new Roman Urdu Hate Speech Dataset (RU-HSD-30K) that was annotated by a team of experts using the annotation rules. Third, because, to the best of our knowledge, a Bi-LSTM model with an attention layer has not been explored for Roman Urdu hate speech detection, we developed a context-aware Roman Urdu hate speech detection model based on Bi-LSTM with an attention layer, using custom word2vec word embeddings. Finally, we examined the effect of lexical normalization of Roman Urdu words on the performance of the proposed model. Several traditional machine learning models and deep learning models, including LSTM and CNN, were used as baselines. The models were evaluated in terms of accuracy, precision, recall, and F1-score. The generalization of each model was also evaluated on a cross-domain dataset.
Experimental results revealed that Bi-LSTM with attention outperformed the traditional machine learning models and the other deep learning models, with an accuracy of 0.875 and an F-score of 0.885. In addition, the results demonstrated that the proposed model (Bi-LSTM with an attention layer) generalizes better than previous models when applied to unseen data. The results also confirmed that lexical normalization of Roman Urdu words enhanced the performance of the proposed model.

Keywords