ArabicDialects: An Efficient Framework for Arabic Dialects Opinion Mining on Twitter Using Optimized Deep Neural Networks

Diaa Salama Abdelminaam; Nabil Neggaz; Ibrahim Abd Elatif Gomaa; Fatma Helmy Ismail; Ahmed A. Elsawy

doi:10.1109/ACCESS.2021.3094173

IEEE Access (Jan 2021)

Affiliations

Diaa Salama Abdelminaam: ORCiD; Information Systems Department, Faculty of Computers and Artificial Intelligence, Benha University, Benha, Egypt
Nabil Neggaz: ORCiD; Département d’informatique, Laboratoire Signal Image Parole (SIMPA), Faculté des Mathématiques et Informatique, Université des Sciences et de la Technologie d’Oran Mohamed Boudiaf (USTO-MB), Oran, Algérie
Ibrahim Abd Elatif Gomaa: Computer Science Department, Al-Obour High Institute for Management and Informatics, Cairo, Egypt
Fatma Helmy Ismail: ORCiD; Faculty of Computer Science, Misr International University, Cairo, Egypt
Ahmed A. Elsawy: Computer Science Department, Faculty of Computers and Artificial Intelligence, Benha University, Benha, Egypt

DOI: https://doi.org/10.1109/ACCESS.2021.3094173
Journal volume & issue: Vol. 9
pp. 97079 – 97099

Abstract

The rapid development of communication tools such as social networks, Twitter, and WhatsApp has generated a large mass of important textual data. The COVID-19 pandemic has further inflamed social networks, so the automatic analysis of opinions has become paramount. The purpose of this paper is to classify Arabic tweets as positive, negative, or neutral. Analyzing opinions expressed in Arabic poses a real challenge, which lies in the use of different dialects (Egyptian, Saudi, Maghrebi, Gulf, Levantine, Syrian, ...). In this paper, we introduce two major components. The first employs six machine learning (ML) methods, namely Decision Trees (DT), Logistic Regression (LR), k-Nearest Neighbors (K-NN), Random Forests (RF), Support Vector Machines (SVM), and Naive Bayes (NB), with TF-IDF as the feature extraction method. The second tests three Deep Learning (DL) variants based on multiplicative Long Short-Term Memory (mLSTM), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU), using word embeddings as the input vectors. The experimental study was validated on three Arabic language corpora (TEAD, ATSAD, and ASTD) under two learning modes (hold-out and 10-fold cross-validation). The results in terms of Accuracy (ACC), Precision (PREC), Recall (REC), and F1-score (F1) show a clear performance advantage for the DL techniques under the 10-fold strategy compared with the state of the art. The experiments reported in the paper show that the proposed DL models achieved the best results.
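As an illustration of the four evaluation measures named in the abstract, the following is a minimal sketch of how ACC, PREC, REC, and F1 could be computed for the three sentiment classes. The abstract does not say how the per-class scores are averaged, so macro-averaging is an assumption here, and the function name `classification_metrics` and the toy labels are hypothetical rather than taken from the paper.

```python
from collections import Counter


def classification_metrics(y_true, y_pred, labels):
    """Return accuracy plus macro-averaged precision, recall, and F1.

    Macro-averaging is an assumption of this sketch; the abstract does
    not state which averaging scheme the authors used.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    # Per-class counts of true positives, false positives, false negatives.
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1

    precisions, recalls, f1s = [], [], []
    for label in labels:
        predicted = tp[label] + fp[label]  # times the label was predicted
        actual = tp[label] + fn[label]     # times the label was the truth
        prec = tp[label] / predicted if predicted else 0.0
        rec = tp[label] / actual if actual else 0.0
        f1 = 2 * prec * rec / (prec + rec) if (prec + rec) else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)

    n = len(labels)
    return {
        "ACC": sum(tp.values()) / len(y_true),
        "PREC": sum(precisions) / n,
        "REC": sum(recalls) / n,
        "F1": sum(f1s) / n,
    }


# Toy example with made-up labels (not data from the paper).
labels = ["positive", "negative", "neutral"]
y_true = ["positive", "positive", "negative", "negative", "neutral", "neutral"]
y_pred = ["positive", "negative", "negative", "negative", "neutral", "positive"]
metrics = classification_metrics(y_true, y_pred, labels)
print(metrics)
```

Under 10-fold cross-validation, a function like this would typically be applied to each held-out fold and the ten results averaged; whether the authors aggregate this way is not stated in the abstract.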

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
