Recurrent neural network based multiclass cyber bullying classification

Silvia Sifath; Tania Islam; Md Erfan; Samrat Kumar Dey; MD. Minhaj Ul Islam; Md Samsuddoha; Tazizur Rahman

Natural Language Processing Journal (Dec 2024)

Affiliations

Silvia Sifath: Department of Computer Science and Engineering, University of Barishal, Barishal 8254, Bangladesh
Tania Islam: Department of Computer Science and Engineering, University of Barishal, Barishal 8254, Bangladesh; Corresponding author.
Md Erfan: Department of Computer Science and Engineering, University of Barishal, Barishal 8254, Bangladesh
Samrat Kumar Dey: School of Science and Technology, Bangladesh Open University, Gazipur 1705, Bangladesh
MD. Minhaj Ul Islam: Department of Computer Science and Engineering, University of Barishal, Barishal 8254, Bangladesh
Md Samsuddoha: Department of Computer Science and Engineering, University of Barishal, Barishal 8254, Bangladesh
Tazizur Rahman: Department of Management Studies, University of Barishal, Barishal 8254, Bangladesh

Journal volume & issue: Vol. 9, p. 100111

Abstract

Cyberbullying is a crime that is growing rapidly with the everyday use of technology, most notably when people share opinions or feelings on social media in a harmful manner. It has several negative effects on society, including depression, anxiety, and suicide. It also reduces productivity, causes psychological damage that can last a lifetime, and increases violence among people. To prevent cyberbullying or take the necessary steps against the harasser, the first step is to detect it. Several works exist on detecting and classifying cyberbullying, but only a few address the Bengali language. As the number of people who communicate on social media in Bengali increases day by day, it is crucial to address this situation and to improve both the accuracy and the robustness of cyberbullying detection and classification. For this purpose, we propose an NLP-based model that uses machine learning and deep learning algorithms to detect and classify Bengali comments on social media. This research categorizes cyberbullying comments using a multiclass classification strategy. The dataset used to train and evaluate our model was collected from Kaggle and Melany. It contains 56,308 Bengali comments in four distinct categories: not bully, trolls, sexual, and threats. We use machine learning algorithms (Support Vector Machine, Logistic Regression, Random Forest, XGBoost, and Multinomial Naïve Bayes), a deep learning algorithm, the Recurrent Neural Network (RNN), as well as two fusion models. In addition, effective preprocessing steps are applied to obtain a suitable dataset. In this study, the Recurrent Neural Network gives the best accuracy, 86%. This accuracy is good enough to help social media users and encourage them to practice morality.

Published in Natural Language Processing Journal

ISSN: 2949-7191 (Online)
Publisher: Elsevier
Country of publisher: Netherlands
LCC subjects: Language and Literature: Philology. Linguistics: Computational linguistics. Natural language processing
Website: https://www.sciencedirect.com/journal/natural-language-processing-journal
