IEEE Access (Jan 2021)

A YouTube Spam Comments Detection Scheme Using Cascaded Ensemble Machine Learning Model

  • Hayoung Oh

DOI
https://doi.org/10.1109/access.2021.3121508
Journal volume & issue
Vol. 9
pp. 144121 – 144128

Abstract

This paper proposes a technique for detecting spam comments on YouTube, whose volume has recently grown tremendously. YouTube runs its own spam-blocking system, but it still fails to block these comments reliably. We therefore reviewed related studies on YouTube spam comment screening and conducted classification experiments on comment data from popular music videos (Psy, Katy Perry, LMFAO, Eminem, and Shakira). The experiments used six machine learning techniques (decision tree, logistic regression, Bernoulli Naïve Bayes, random forest, support vector machine with a linear kernel, and support vector machine with a Gaussian kernel) and two ensemble models that combine these techniques (ensemble with hard voting and ensemble with soft voting).
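The abstract does not spell out how the two ensembles combine their base classifiers, so the following is only a minimal sketch of the standard definitions: hard voting takes the majority of the predicted labels, while soft voting averages the predicted spam probabilities and thresholds the mean. The function names, the 0.5 threshold, and the three probability values are hypothetical and chosen purely for illustration; they are not taken from the paper.

```python
from collections import Counter
from statistics import mean


def hard_vote(labels):
    """Return the majority class label among the base classifiers' predictions.

    Ties (possible with an even number of voters) resolve to the label
    that appears first in `labels`; this tie rule is an assumption.
    """
    return Counter(labels).most_common(1)[0][0]


def soft_vote(spam_probs, threshold=0.5):
    """Average the predicted spam probabilities and threshold the mean."""
    return 1 if mean(spam_probs) >= threshold else 0


# Hypothetical spam probabilities from three base classifiers for one comment
# (illustrative values only, not results from the paper).
spam_probs = [0.45, 0.45, 0.95]

# Each classifier's own label: 1 = spam, 0 = not spam.
labels = [1 if p >= 0.5 else 0 for p in spam_probs]

hard_result = hard_vote(labels)      # two of three classifiers say "not spam"
soft_result = soft_vote(spam_probs)  # one very confident vote lifts the mean above 0.5

print("hard voting:", hard_result)
print("soft voting:", soft_result)
```

With these illustrative inputs the two schemes disagree: hard voting follows the two weak "not spam" votes, whereas soft voting lets the single high-confidence prediction decide. This is why the two ensembles can score differently on the same base classifiers.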

Keywords