IEEE Access (Jan 2024)
Cyberbullying Detection and Abuser Profile Identification on Social Media for Roman Urdu
Abstract
Cyberbullying is a pervasive and growing concern in today's digital era. With the increasing prevalence of social media platforms such as Twitter, abusive online behavior has become a significant issue that often leads to unpleasant experiences for users. Manual detection of abnormal and bullying behavior on social media does not scale. Moreover, most existing studies on cyberbullying detection have been conducted on English text, and very little work has been done on Urdu (a widely used language in Asia). This paper presents an approach for detecting cyberbullying in Roman Urdu tweets and identifying abuser profiles on Twitter. First, we develop a text corpus of Roman Urdu tweets with user profile data. Next, we build a cyberbullying detection model using a Gated Recurrent Unit (GRU) network with word2vec word embeddings. Furthermore, we present a temporal abusive tweet probability analysis method that examines the number of bullying and non-bullying tweets sent by an individual within a specific time interval. To evaluate performance, we compare the GRU-based approach with other machine learning models. The results show that the GRU model with lexical normalization performs best, with an accuracy of 97% and an F1-measure of 97%.
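The abstract does not give a formula for the temporal abusive tweet probability analysis. As a minimal sketch, one plausible reading is the fraction of a user's tweets within a time window that the detection model labels as bullying. The function name `abusive_tweet_probability`, the `(user, timestamp, is_bullying)` tuple layout, and the half-open window are all assumptions made for illustration, not the authors' definition.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def abusive_tweet_probability(tweets, window_start, window_length):
    """Per-user fraction of tweets labeled as bullying inside one time window.

    tweets: iterable of (user, timestamp, is_bullying) tuples, where
        is_bullying is the (hypothetical) classifier label for that tweet.
    window_start: datetime marking the start of the window (inclusive).
    window_length: timedelta; the window is [window_start, window_start + window_length).
    Users with no tweets in the window are omitted from the result.
    """
    window_end = window_start + window_length
    counts = defaultdict(lambda: [0, 0])  # user -> [bullying count, total count]
    for user, timestamp, is_bullying in tweets:
        if window_start <= timestamp < window_end:
            counts[user][1] += 1
            if is_bullying:
                counts[user][0] += 1
    return {user: bullying / total for user, (bullying, total) in counts.items()}
```

Under this reading, a user whose fraction stays high across successive windows would be a candidate abuser profile; any threshold for flagging a profile would have to come from the paper itself.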
Keywords