Sentiment analysis techniques, challenges, and opportunities: Urdu language-based analytical study

Muhammad Irzam Liaqat; Muhammad Awais Hassan; Muhammad Shoaib; Syed Khaldoon Khurshid; Mohamed A. Shamseldin

doi:10.7717/peerj-cs.1032

PeerJ Computer Science (Aug 2022)

Affiliations

Muhammad Irzam Liaqat: Department of Computer Science, University of Engineering and Technology Lahore, Lahore, Punjab, Pakistan
Muhammad Awais Hassan: Department of Computer Science, University of Engineering and Technology Lahore, Lahore, Punjab, Pakistan
Muhammad Shoaib: Department of Computer Science, University of Engineering and Technology Lahore, Lahore, Punjab, Pakistan
Syed Khaldoon Khurshid: Department of Computer Science, University of Engineering and Technology Lahore, Lahore, Punjab, Pakistan
Mohamed A. Shamseldin: Dept. of Mechanical Engineering, Faculty of Engineering Technology, Future University in Egypt, New Cairo, Egypt

DOI: https://doi.org/10.7717/peerj-cs.1032
Journal volume & issue: Vol. 8
p. e1032

Abstract

Sentiment analysis involves processing and analyzing the sentiments expressed in textual data. Sentiment analysis for high-resource languages such as English and French has been carried out effectively in the past. However, its applications are comparatively few for resource-poor languages because of a lack of textual resources. This systematic literature review explores different aspects of Urdu-based sentiment analysis, a classic case of a resource-poor language, even though Urdu is a South Asian language understood by one hundred and sixty-nine million people across the planet. The literature has various shortcomings, including the limited availability of large corpora and language parsers and a lack of pre-trained machine learning models, which result in poor performance. This article analyzes and evaluates studies addressing machine learning-based Urdu sentiment analysis. After searching and filtering, forty articles were inspected. Research objectives were proposed that led to research questions. Our searches were conducted in digital repositories, and relevant studies were selected and screened. Data was extracted from these studies. Our review of the existing literature indicates that sentiment classification performance can be improved by overcoming challenges such as word sense disambiguation and the lack of massive datasets. Furthermore, Urdu-based language constructs, including language parsers and emoticons, context-level sentiment analysis techniques, pre-processing methods, and lexical resources, can also be improved.

Published in PeerJ Computer Science

ISSN: 2376-5992 (Online)
Publisher: PeerJ Inc.
Country of publisher: United States
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://peerj.com/computer-science/