Applied Sciences (Apr 2022)

Arabic Language Opinion Mining Based on Long Short-Term Memory (LSTM)

  • Arief Setyanto,
  • Arif Laksito,
  • Fawaz Alarfaj,
  • Mohammed Alreshoodi,
  • Kusrini,
  • Irwan Oyong,
  • Mardhiya Hayaty,
  • Abdullah Alomair,
  • Naif Almusallam,
  • Lilis Kurniasari

DOI
https://doi.org/10.3390/app12094140
Journal volume & issue
Vol. 12, no. 9
p. 4140

Abstract


Arabic is one of the official languages recognized by the United Nations (UN) and is widely used in the Middle East as well as parts of Asia and Africa. Social media activity currently dominates textual communication on the Internet and potentially represents people’s views about specific issues. Opinion mining is an important task for understanding the polarity of public opinion towards an issue. Understanding public opinion leads to better decisions in many fields, such as public services and business. Language background plays a vital role in understanding opinion polarity; variation arises not only from vocabulary but also from cultural background. A sentence is a time-series signal; therefore, word order contributes significantly to the meaning of the text. A recurrent neural network (RNN) is a deep learning architecture that takes this sequence into account. Long short-term memory (LSTM) is an RNN variant with gates that keep or discard specific word signals across a sequence of inputs. Text is unstructured data and cannot be processed by a machine learning algorithm until it is transformed into a numerical vector representation. Transformation algorithms range from the Term Frequency–Inverse Document Frequency (TF-IDF) transform to advanced word embedding. Word embedding methods include GloVe, word2vec, BERT, and fastText. This research experimented with such algorithms to perform vector transformation of an Arabic text dataset. Specifically, this study implements and compares the GloVe and fastText word embedding algorithms combined with LSTM networks in single-, double-, and triple-layer architectures, and compares their accuracy for opinion mining on Arabic text. The proposed models are evaluated on the ASAD dataset of 55,000 annotated tweets labeled with three classes. The dataset was augmented to achieve equal proportions of positive, negative, and neutral classes. According to the evaluation results, the triple-layer LSTM with fastText word embedding achieved the best testing accuracy, at 90.9%, surpassing all other experimental scenarios.
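As an illustration of the architecture the abstract describes, the minimal sketch below builds a stacked three-layer LSTM classifier on top of a fastText-initialized embedding layer. It assumes a TensorFlow/Keras implementation; the vocabulary size, sequence length, layer widths, and the randomly filled embedding matrix are placeholders for illustration, not values taken from the paper.

```python
# Minimal sketch (not the authors' code): three stacked LSTM layers over a
# fastText-style embedding layer for three-class sentiment classification.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

VOCAB_SIZE = 20000   # assumed vocabulary size
EMBED_DIM = 300      # fastText vectors are commonly 300-dimensional
NUM_CLASSES = 3      # positive, negative, neutral

# Placeholder embedding matrix; in practice each row would hold the pretrained
# fastText vector for the corresponding token in the vocabulary.
embedding_matrix = np.random.normal(size=(VOCAB_SIZE, EMBED_DIM)).astype("float32")

model = models.Sequential([
    layers.Embedding(
        VOCAB_SIZE, EMBED_DIM,
        embeddings_initializer=tf.keras.initializers.Constant(embedding_matrix),
        trainable=False),                      # keep pretrained embeddings fixed
    layers.LSTM(128, return_sequences=True),   # first LSTM layer
    layers.LSTM(64, return_sequences=True),    # second LSTM layer
    layers.LSTM(32),                           # third LSTM layer, final hidden state
    layers.Dense(NUM_CLASSES, activation="softmax"),
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```

Single- and double-layer variants follow by removing trailing LSTM layers; the paper compares these depths under both GloVe and fastText embeddings.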

Keywords