PeerJ Computer Science (Aug 2022)

Sentiment analysis techniques, challenges, and opportunities: Urdu language-based analytical study

  • Muhammad Irzam Liaqat,
  • Muhammad Awais Hassan,
  • Muhammad Shoaib,
  • Syed Khaldoon Khurshid,
  • Mohamed A. Shamseldin

DOI
https://doi.org/10.7717/peerj-cs.1032
Journal volume & issue
Vol. 8
p. e1032

Abstract


Sentiment analysis involves the processing and analysis of sentiments expressed in textual data. Sentiment analysis for high-resource languages such as English and French has been carried out effectively in the past. However, its applications are comparatively few for resource-poor languages due to a lack of textual resources. This systematic literature review explores different aspects of Urdu-based sentiment analysis, a classic case of a resource-poor language. Urdu is a South Asian language understood by one hundred and sixty-nine million people across the planet. The literature shows various shortcomings, including the limited availability of large corpora and language parsers and the lack of pre-trained machine learning models, all of which result in poor performance. This article analyzes and evaluates studies addressing machine learning-based Urdu sentiment analysis. Research objectives were proposed that led to research questions. We searched digital repositories, selected and screened relevant studies, and extracted data from them. After searching and filtering, forty articles were inspected. Our review of the existing literature indicates that sentiment classification performance can be improved by overcoming challenges such as word sense disambiguation and the lack of massive datasets. Furthermore, Urdu-based language constructs, including language parsers and emoticons, context-level sentiment analysis techniques, pre-processing methods, and lexical resources, can also be improved.

Keywords