IEEE Access (Jan 2022)
Arabic Aspect Extraction Based on Stacked Contextualized Embedding With Deep Learning
Abstract
The exponential growth of the internet and a multi-fold increase in social media users in the last decade have resulted in a massive growth of unstructured data. Aspect-Based Sentiment Analysis (ABSA) is challenging because it performs a fine-grain analysis; it is a text analysis technique where the opinions group is based on the aspect. The Aspect Extraction (AE) task is one of the core subtasks of ABSA; it helps to identify aspect terms in the text, comments, or reviews. The challenge of the Arabic AE task increases due to the complexity of the Arabic language. This work aims to develop the Arabic AE task by proposing transfer learning using state-of-art pre-trained contextual language models. We concatenate the Bidirectional Encoder Representation from Transformers (BERT) language model and contextualize string embeddings (Flair embedding) as a stacked embeddings layer for better word representation for Arabic language. Then, we extend it with different deep learning network architectures. For Arabic AE, the model is developed by concatenating the Arabic contextual language model, AraBERT, and Flair embedding as a contextual stacked embeddings layer with an extended layer, BiLSTM-CRF or BiGRU-CRF, for sequence labeling. Our proposed models are called BF-BiLSTM-CRF and BF-BiGRU-CRF. The proposed model is evaluated using the Arabic Hotel’s reviews dataset. For performance evaluation, we used the F1 score. The experimental results show that the proposed BF-BiLSTM-CRF configuration outperformed the baseline and other models by achieving an F1score of 79.7%.
Keywords