Journal of ICT Research and Applications (Apr 2023)

Sentiment Classification for Film Reviews in Gujarati Text Using Machine Learning and Sentiment Lexicons

  • Parita Shah,
  • Priya Swaminarayan,
  • Maitri Patel

DOI
https://doi.org/10.5614/itbj.ict.res.appl.2023.17.1.1
Journal volume & issue
Vol. 17, no. 1

Abstract

This paper proposes two techniques for sentiment classification of film reviews written in Gujarati: Gujarati Lexicon Sentiment Analysis (GLSA) and Gujarati Machine Learning Sentiment Analysis (GMLSA). Five different datasets were produced to validate the accuracy of the lexicon-based and machine learning-based methods. The lexicon-based approach employs a sentiment lexicon, GujSentiWordNet, which assigns sentiment scores to words for feature generation. The machine learning-based approach uses five classifiers: logistic regression (LR), random forest (RF), k-nearest neighbors (KNN), support vector machine (SVM), and naive Bayes (NB), with TF-IDF and count vectorizer used for feature selection. Experiments were carried out, and the results were compared using accuracy, precision, recall, and F-score as the performance evaluation criteria. According to the test results, the machine learning-based technique improved accuracy by 3 to 10% on average compared with the lexicon-based approach.
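
As a rough illustration of the lexicon-based idea described in the abstract, the sketch below sums per-word sentiment scores and labels a review by the sign of the total. It is a minimal sketch under stated assumptions, not the authors' GLSA implementation: the `TOY_LEXICON` entries, their scores, the whitespace tokenization, and the function name `classify_review` are all hypothetical and are not taken from GujSentiWordNet.

```python
# Toy lexicon mapping a word to (positive score, negative score).
# ASSUMPTION: these entries and scores are illustrative only; they are
# not taken from GujSentiWordNet.
TOY_LEXICON = {
    "સરસ": (0.75, 0.0),   # "nice"
    "ખરાબ": (0.0, 0.75),  # "bad"
}


def classify_review(text, lexicon=TOY_LEXICON):
    """Label a review by summing (positive - negative) scores of known words.

    ASSUMPTION: tokens are separated by whitespace; real Gujarati text
    would likely need proper tokenization and stemming.
    """
    score = 0.0
    for token in text.split():
        pos, neg = lexicon.get(token, (0.0, 0.0))  # unknown words score 0
        score += pos - neg
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Calling `classify_review("ફિલ્મ સરસ છે")` on this toy lexicon sums the scores of the words it recognizes and ignores the rest, so a review with no lexicon words falls back to the neutral label.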

Keywords