Вестник КазНУ. Серия математика, механика, информатика (Nov 2017)

Automatic classification of reviews based on machine learning

  • K. Ch. Koybagarov,
  • M. Ye. Mansurova

Journal volume & issue
Vol. 91, no. 3
pp. 66 – 74

Abstract

There is currently strong interest in the automatic analysis of reviews that Internet users write on various topics. One of the main tasks in review analysis is classifying the sentiment (tone) of the texts. This article examines different approaches to three-class sentiment classification using machine learning methods, evaluated on three collections. The main objective of the work is to compare different approaches to text representation within the vector model, several machine learning methods, and various combinations of statistical and linguistic features. To build the sentiment classification model, the following set of statistical and linguistic features was identified: building word vectors, accounting for N-grams, accounting for emoticons, counting exclamation and question marks, accounting for parts of speech, replacing long vowel repetitions with a single vowel, accounting for negations, and accounting for review length. The following machine learning methods were used: support vector machines, logistic regression, and the naive Bayes classifier. Computational experiments were conducted with different variants of word vector models, N-grams, and text description features. The experimental results allow us to make recommendations on selecting the most effective features for sentiment classification.
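
As an illustration only, several of the features listed above could be extracted along the following lines. This is a minimal sketch, not the authors' implementation. The emoticon pattern, the negation word list, the threshold of three or more repeated vowels, the English-language input, and all function and feature names are assumptions made for this example. Part-of-speech features are omitted because they require an external tagger.

```python
import re
from collections import Counter

# Hypothetical patterns and word lists; the abstract does not specify them.
EMOTICON_RE = re.compile(r"[:;=][\-^']?[)(DPp\]\[]")
NEGATIONS = {"not", "no", "never"}
# Assumption: a "long" repetition means three or more identical vowels.
VOWEL_RUN_RE = re.compile(r"([aeiou])\1{2,}", re.IGNORECASE)
TOKEN_RE = re.compile(r"[a-z']+")


def normalize(text):
    """Replace a run of three or more identical vowels with one vowel."""
    return VOWEL_RUN_RE.sub(r"\1", text)


def word_ngrams(tokens, n):
    """Return the word n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_features(review):
    """Map one review to a sparse feature dictionary."""
    text = normalize(review)
    tokens = TOKEN_RE.findall(text.lower())

    features = Counter(tokens)               # unigram counts
    features.update(word_ngrams(tokens, 2))  # bigram counts

    # Simple statistical features, stored under reserved names.
    features["__emoticons__"] = len(EMOTICON_RE.findall(text))
    features["__exclamations__"] = text.count("!")
    features["__questions__"] = text.count("?")
    features["__negations__"] = sum(t in NEGATIONS for t in tokens)
    features["__length__"] = len(tokens)
    return dict(features)
```

Dictionaries of this kind could then be vectorized and passed to any of the three classifier families named in the abstract, for instance with scikit-learn's `DictVectorizer` followed by `LinearSVC`, `LogisticRegression`, or `MultinomialNB`. That pairing is a suggestion for experimentation, not something the abstract states.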

Keywords