Comparison between Machine Learning and Deep Learning Approaches for the Detection of Toxic Comments on Social Networks

Andrea Bonetti; Marcelino Martínez-Sober; Julio C. Torres; Jose M. Vega; Sebastien Pellerin; Joan Vila-Francés

doi:10.3390/app13106038

Applied Sciences (May 2023)

Affiliations

Andrea Bonetti: Intelligent Data Analysis Laboratory (IDAL), Department of Electronic Engineering, ETSE (Engineering School), Universitat de València (UV), Av. Universitat, sn, 46100 Burjassot, Spain
Marcelino Martínez-Sober: Intelligent Data Analysis Laboratory (IDAL), Department of Electronic Engineering, ETSE (Engineering School), Universitat de València (UV), Av. Universitat, sn, 46100 Burjassot, Spain
Julio C. Torres: Allot Communications Spain SLU, C. José Echegaray, 8, 28232 Las Rozas de Madrid, Spain
Jose M. Vega: Allot Communications Spain SLU, C. José Echegaray, 8, 28232 Las Rozas de Madrid, Spain
Sebastien Pellerin: Allot Communications Spain SLU, C. José Echegaray, 8, 28232 Las Rozas de Madrid, Spain
Joan Vila-Francés: Intelligent Data Analysis Laboratory (IDAL), Department of Electronic Engineering, ETSE (Engineering School), Universitat de València (UV), Av. Universitat, sn, 46100 Burjassot, Spain

DOI: https://doi.org/10.3390/app13106038
Journal volume & issue: Vol. 13, no. 10
p. 6038

Abstract

Social networks have revolutionised the way we communicate. Any online message can reach anyone in the world almost instantly. This speed is undoubtedly the strength of social networks, but any user of these platforms can also see toxic messages spreading alongside likes, comments and ratings about any person or entity. In such cases, the rapid spread leaves the victim feeling even more helpless and defenceless. For this reason, we have implemented an automatic detector of toxic messages on social media, which makes it possible to stop toxicity in its tracks and protect victims. The aim of the study is to show that traditional Machine Learning methods of Natural Language Processing (NLP) perform on equal terms with Deep Learning methods, represented here by a Transformer architecture and characterised by a higher computational cost. Specifically, the paper describes the results obtained by testing different supervised Machine Learning classifiers (Logistic Regression, Random Forest and Support Vector Machine) combined with two NLP topic-modelling techniques (Latent Semantic Analysis and Latent Dirichlet Allocation). A pre-trained Transformer named BERTweet was also tested. All models performed well in this task, achieving F1 scores close to or above 90%. The best result, 91.40%, was achieved by the BERTweet Transformer, but it is not impressive in this context, because the performance gain is too small relative to the computational overhead.
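
The kind of pipeline the abstract describes, a topic-modelling step feeding a supervised classifier scored with F1, might look roughly like the following minimal sketch. It assumes scikit-learn, uses TF-IDF followed by truncated SVD as a common way to obtain Latent Semantic Analysis features, and pairs them with Logistic Regression. The toy comments, the labels, the helper name build_lsa_classifier and the choice of two topics are all hypothetical and are not taken from the paper, whose actual preprocessing and hyperparameters are not given here.

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.pipeline import make_pipeline

# Hypothetical toy data invented for illustration (1 = toxic, 0 = non-toxic).
texts = [
    "you are a complete idiot",
    "nobody likes you, idiot",
    "what a stupid and useless post",
    "you are stupid and useless",
    "thanks for sharing this article",
    "great post, thanks a lot",
    "have a lovely day everyone",
    "this article was really helpful",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]


def build_lsa_classifier(n_topics=2, seed=0):
    """Hypothetical helper: TF-IDF -> truncated SVD (LSA) -> Logistic Regression."""
    return make_pipeline(
        TfidfVectorizer(),
        # Truncated SVD on a TF-IDF matrix is the usual LSA construction.
        TruncatedSVD(n_components=n_topics, random_state=seed),
        LogisticRegression(),
    )


model = build_lsa_classifier()
model.fit(texts, labels)

# Scored on the training data only to show the API; a real evaluation
# would use a held-out test set.
predictions = model.predict(texts)
train_f1 = f1_score(labels, predictions)
print(f"F1 on the toy training set: {train_f1:.2f}")
```

Replacing LogisticRegression with RandomForestClassifier or SVC, or TruncatedSVD with LatentDirichletAllocation on raw term counts, would give the other classifier and topic-model combinations named in the abstract, under the same assumptions.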

Published in Applied Sciences

ISSN: 2076-3417 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Engineering (General). Civil engineering (General); Science: Biology (General); Science: Physics; Science: Chemistry
Website: http://www.mdpi.com/journal/applsci
