IEEE Access (Jan 2021)
Sentiment Analysis Using Stacked Gated Recurrent Unit for Arabic Tweets
Abstract
Over the last decade, the amount of Arabic content created on websites and social media has grown significantly. Opinions are shared openly and freely on social media, providing a rich source for trend analysis. Such analyses can be performed automatically through natural language processing tasks such as sentiment analysis. These tasks were initially implemented with classical machine learning; owing to its accuracy on unstructured data, deep learning has increasingly been adopted as well. The gated recurrent unit (GRU) is a promising approach for analyzing text in languages that exhibit large morphological variation, such as Arabic. We propose two neural models, the stacked gated recurrent unit (SGRU) and the stacked bidirectional gated recurrent unit (SBi-GRU), combined with word embedding to mine Arabic opinions. We also propose a new way of discarding stop words through automatic sentiment refinement (ASR) instead of relying on manually collected stop words or low-quality publicly available Arabic stop-word lists. The performance of our proposed models is compared with that of long short-term memory (LSTM), the support vector machine (SVM), and the recent pretrained Arabic bidirectional encoder representations from transformers (AraBERT). In addition, we compare our models' performance to that of an ensemble of the abovementioned models to identify the best architecture for Arabic natural language processing (NLP). To the best of our knowledge, no previous studies have applied either the unidirectional or the bidirectional SGRU to Arabic sentiment classification, and no ensemble models have been built from these architectures for the Arabic language. The results show that the six-layer SGRU and the five-layer SBi-GRU achieve the highest accuracy, and that the ensemble method outperforms all other models, with an accuracy exceeding 90%.
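For illustration, the sketch below shows how a stacked (bidirectional) GRU sentiment classifier with a word-embedding input layer can be assembled in Keras. This is not the authors' implementation; the vocabulary size, embedding dimension, sequence length, and hidden-unit counts are illustrative assumptions, with only the stacking depths (six SGRU layers, five SBi-GRU layers) taken from the abstract.

```python
# Minimal sketch of a stacked GRU (SGRU) / stacked bidirectional GRU (SBi-GRU)
# binary sentiment classifier with a word-embedding input layer (Keras).
# Hyperparameters below are illustrative assumptions, not the paper's settings.
import tensorflow as tf
from tensorflow.keras import layers, models

VOCAB_SIZE = 20000   # assumed tokenizer vocabulary size
EMBED_DIM = 300      # assumed word-embedding dimension
MAX_LEN = 64         # assumed maximum tweet length in tokens

def build_stacked_gru(num_layers, units=128, bidirectional=False):
    """Stack `num_layers` (Bi-)GRU layers; all but the last return sequences."""
    model = models.Sequential()
    model.add(layers.Embedding(VOCAB_SIZE, EMBED_DIM, input_length=MAX_LEN))
    for i in range(num_layers):
        return_seq = i < num_layers - 1            # final layer emits a vector
        gru = layers.GRU(units, return_sequences=return_seq)
        model.add(layers.Bidirectional(gru) if bidirectional else gru)
    model.add(layers.Dense(1, activation="sigmoid"))   # positive/negative output
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

# Depths matching those reported in the abstract.
sgru = build_stacked_gru(num_layers=6, bidirectional=False)
sbi_gru = build_stacked_gru(num_layers=5, bidirectional=True)
```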
Keywords