Sistemas de Informação (Jul 2018)
Bullying Detection: How to automatically identify this practice in social networks?
Abstract
Machine learning techniques can be used to automatically infer, from large volumes of data, information that is not explicitly available. Social networks have been gaining popularity and have become an important source of data for the application of computational techniques. The goal of this work is to study how well preprocessing techniques, feature sets, and classifiers perform on the task of automatic bullying trace detection, as well as on identifying the role that the author of the text played in the reported episode. We focused on social network texts written in Brazilian Portuguese. Several classifiers and attribute sets were studied and compared in order to identify which is the most appropriate for this task. Among all tested configurations, the best results were obtained with the largest training set, represented as a feature set of unigrams and bigrams, combined with an SVM with an RBF kernel.
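To make the best-performing configuration more concrete, the following is a minimal sketch of how a text can be turned into unigram and bigram counts and how an RBF kernel compares two such representations. It is an illustration only and not the pipeline used in this work: the function names `ngram_counts` and `rbf_kernel`, the whitespace tokenization, and the value of `gamma` are all assumptions, and no SVM is actually trained here.

```python
import math
from collections import Counter


def ngram_counts(text, n_values=(1, 2)):
    """Count word n-grams (unigrams and bigrams by default) in a text.

    Tokenization is a plain lowercase whitespace split, which is an
    assumption made for brevity; real preprocessing would be richer.
    """
    tokens = text.lower().split()
    counts = Counter()
    for n in n_values:
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


def rbf_kernel(a, b, gamma=0.5):
    """RBF kernel exp(-gamma * ||a - b||^2) between two sparse count vectors.

    `gamma` is a hypothetical value; in practice it is tuned on held-out data.
    """
    keys = set(a) | set(b)
    # Counter returns 0 for missing keys, so absent n-grams count as zero.
    squared_distance = sum((a[k] - b[k]) ** 2 for k in keys)
    return math.exp(-gamma * squared_distance)


# Hypothetical example texts, not taken from the studied data set.
first = ngram_counts("they mocked me at school")
second = ngram_counts("they mocked him at school")
similarity = rbf_kernel(first, second)
```

An SVM with an RBF kernel uses similarity values of this kind between a new text and its support vectors to decide the class. Including bigrams lets the representation capture short word sequences that unigrams alone would miss.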