Cyberbullying Detection and Abuser Profile Identification on Social Media for Roman Urdu

Ayesha Atif; Amna Zafar; Muhammad Wasim; Talha Waheed; Amjad Ali; Hazrat Ali; Zubair Shah

doi:10.1109/ACCESS.2024.3445288

IEEE Access (Jan 2024)

Affiliations

Ayesha Atif: Department of Computer Science, University of Engineering and Technology Lahore, Lahore, Pakistan
Amna Zafar: ORCiD; Department of Computer Science, University of Engineering and Technology Lahore, Lahore, Pakistan
Muhammad Wasim: ORCiD; University of Management and Technology, Sialkot Campus, Lahore, Pakistan
Talha Waheed: ORCiD; Department of Computer Science, University of Engineering and Technology Lahore, Lahore, Pakistan
Amjad Ali: ORCiD; Division of Information and Computing Technology, College of Science and Engineering (CSE), Hamad Bin Khalifa University (HBKU), Qatar Foundation, Doha, Qatar
Hazrat Ali: ORCiD; Computing Science and Mathematics, University of Stirling, Stirling, U.K.
Zubair Shah: Division of Information and Computing Technology, College of Science and Engineering (CSE), Hamad Bin Khalifa University (HBKU), Qatar Foundation, Doha, Qatar

DOI: https://doi.org/10.1109/ACCESS.2024.3445288
Journal volume & issue: Vol. 12
pp. 123339 – 123351

Abstract

Cyberbullying is a pervasive and growing concern. With the increasing prevalence of social media platforms such as Twitter, online abusive behavior has become a significant issue that often leads to unpleasant experiences for users. Manual detection of bullying behavior on social media does not scale. Moreover, most existing studies on cyberbullying detection have been conducted in English, and very limited work has been done on Urdu (a widely used language in Asia). This paper presents an approach for detecting cyberbullying in Roman Urdu tweets and identifying abuser profiles on Twitter. First, we develop a text corpus of Roman Urdu tweets with user profile data. Next, we train a cyberbullying detection model based on a Gated Recurrent Unit (GRU) with word2vec word embeddings. We also present a temporal abusive tweet probability analysis method that examines the number of bullying and non-bullying tweets sent by an individual within a specific time interval. To evaluate performance, we compare the GRU-based approach with other machine learning models. The results show that the GRU model with lexical normalization performs best, with an accuracy of 97% and an F1-measure of 97%.
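
The abstract does not give a formula for the temporal abusive tweet probability analysis. As a rough illustration only, the sketch below assumes the probability for a user in a time window is the number of tweets classified as bullying divided by the total number of tweets that user sent in the window. The function name, the tuple layout, the one-day default window, and the sample data are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)  # fixed reference point for aligning windows (assumption)


def abusive_tweet_probability(tweets, window=timedelta(days=1)):
    """Return {(user, window_start): share of bullying tweets in that window}.

    `tweets` is an iterable of (user, timestamp, is_bullying) tuples, where
    `is_bullying` is the label assigned by some classifier. This layout is a
    hypothetical choice for illustration, not the paper's data format.
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [bullying count, total count]
    for user, timestamp, is_bullying in tweets:
        # Snap the timestamp down to the start of its window.
        window_start = EPOCH + ((timestamp - EPOCH) // window) * window
        entry = counts[(user, window_start)]
        entry[0] += int(is_bullying)
        entry[1] += 1
    return {key: bullying / total for key, (bullying, total) in counts.items()}


if __name__ == "__main__":
    # Made-up sample data, for demonstration only.
    sample = [
        ("user_a", datetime(2024, 1, 1, 9, 0), True),
        ("user_a", datetime(2024, 1, 1, 17, 30), False),
        ("user_a", datetime(2024, 1, 2, 8, 15), True),
        ("user_b", datetime(2024, 1, 1, 12, 0), False),
    ]
    for (user, start), prob in sorted(abusive_tweet_probability(sample).items()):
        print(user, start.date(), prob)
```

Under this assumed definition, a user whose per-window share stays high across consecutive windows would be a candidate abuser profile. Where to set such a threshold is a separate design decision that the abstract does not specify.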

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
