Machine learning methods for toxic comment classification: a systematic review

Andročec Darko

doi:10.2478/ausi-2020-0012

Acta Universitatis Sapientiae: Informatica (Dec 2020)

Affiliations

Andročec Darko: Faculty of Organization and Informatics, University of Zagreb, Pavlinska 2, 42000 Varaždin, Croatia

DOI: https://doi.org/10.2478/ausi-2020-0012
Journal volume & issue: Vol. 12, no. 2
pp. 205 – 216

Abstract

Nowadays users leave numerous comments on different social networks, news portals, and forums. Some of these comments are toxic or abusive. Because of the sheer number of comments, it is infeasible to moderate them manually, so most systems rely on some form of automatic toxicity detection using machine learning models. In this work, we performed a systematic review of the state of the art in toxic comment classification using machine learning methods. We extracted data from 31 selected relevant primary studies. First, we investigated when and where the papers were published and their maturity level. In our analysis of every primary study, we investigated the data set used, the evaluation metric, the machine learning methods used, the classes of toxicity, and the comment language. We conclude with a comprehensive list of gaps in current research and suggestions for future research themes related to the online toxic comment classification problem.

Published in Acta Universitatis Sapientiae: Informatica

ISSN: 2066-7760 (Online)
Publisher: Scientia Publishing House
Country of publisher: Romania
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://acta.sapientia.ro/en/series/informatica
