IEEE Access (Jan 2022)

Sentiment Analysis of Reviews in Natural Language: Roman Urdu as a Case Study

  • Muhammad Aasim Qureshi,
  • Muhammad Asif,
  • Mohd Fadzil Hassan,
  • Adnan Abid,
  • Asad Kamal,
  • Sohail Safdar,
  • Rehan Akbar

DOI
https://doi.org/10.1109/ACCESS.2022.3150172
Journal volume & issue
Vol. 10
pp. 24945 – 24954

Abstract

Opinion mining from user reviews is an emerging field. Sentiment analysis of natural-language text helps in finding the opinions of customers. These reviews can be in any language, e.g., English, Chinese, Arabic, Japanese, Urdu, and Hindi. This research presents a model to classify the polarity of reviews written in Roman Urdu. For this purpose, raw data was scraped from the reviews of 20 songs from the Indo-Pak music industry, and a new dataset of 24,000 Roman Urdu reviews was created. Nine machine learning algorithms were evaluated: Naïve Bayes, Support Vector Machine, Logistic Regression, K-Nearest Neighbors, Artificial Neural Networks, Convolutional Neural Network, Recurrent Neural Networks, ID3, and Gradient Boost Tree. Logistic Regression outperformed the rest, with testing and cross-validation accuracies of 92.25% and 91.47%, respectively.
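As a rough illustration of the kind of polarity classifier the abstract describes, the sketch below trains a small logistic regression model on bag-of-words counts using only the Python standard library. It is not the authors' pipeline: the abstract does not state the feature representation, preprocessing, or hyperparameters, so the whitespace tokenizer, the count features, the learning rate, and the epoch count are all assumptions. The six Roman Urdu reviews are invented for this example and are not drawn from the 24,000-review dataset.

```python
import math
from collections import Counter

# Invented toy reviews (1 = positive, 0 = negative).
# ASSUMPTION: these are illustrative only, not from the paper's dataset.
TRAIN = [
    ("bohat acha gana hai", 1),
    ("zabardast awaz aur acha music", 1),
    ("kamaal ka gana bohat pasand aya", 1),
    ("bakwas gana hai", 0),
    ("bilkul bekar awaz", 0),
    ("bekar music bilkul pasand nahi aya", 0),
]

def tokenize(text):
    # ASSUMPTION: lowercase + whitespace split; the paper's preprocessing may differ.
    return text.lower().split()

def build_vocab(samples):
    # Map each distinct token to a feature index.
    vocab = {}
    for text, _ in samples:
        for tok in tokenize(text):
            vocab.setdefault(tok, len(vocab))
    return vocab

def vectorize(text, vocab):
    # Sparse bag-of-words counts; out-of-vocabulary tokens are ignored.
    counts = Counter(tokenize(text))
    return {vocab[t]: c for t, c in counts.items() if t in vocab}

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def score(x, w, b):
    # Linear score for one sparse feature vector.
    return b + sum(w[i] * v for i, v in x.items())

def train_logreg(samples, vocab, epochs=200, lr=0.5):
    # Plain stochastic gradient descent on the logistic loss.
    # ASSUMPTION: epochs and lr are arbitrary values chosen for the toy data.
    w = [0.0] * len(vocab)
    b = 0.0
    data = [(vectorize(t, vocab), y) for t, y in samples]
    for _ in range(epochs):
        for x, y in data:
            err = sigmoid(score(x, w, b)) - y
            for i, v in x.items():
                w[i] -= lr * err * v
            b -= lr * err
    return w, b

def predict(text, vocab, w, b):
    # Return 1 (positive) if the predicted probability is at least 0.5, else 0.
    return 1 if sigmoid(score(vectorize(text, vocab), w, b)) >= 0.5 else 0

vocab = build_vocab(TRAIN)
w, b = train_logreg(TRAIN, vocab)
train_acc = sum(predict(t, vocab, w, b) == y for t, y in TRAIN) / len(TRAIN)
print(f"vocabulary size: {len(vocab)}, training accuracy: {train_acc:.2f}")
```

Calling `predict("acha gana", vocab, w, b)` returns a 0/1 polarity label for an unseen review. Training accuracy on six sentences says nothing about generalization; the testing and cross-validation accuracies reported in the abstract come from held-out data, which a realistic experiment would need as well.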

Keywords