Ain Shams Engineering Journal (Jun 2024)
Arabic sarcasm detection: An enhanced fine-tuned language model approach
Abstract
Sarcasm is a complex linguistic phenomenon that uses humor, criticism, or phrases conveying the opposite of their literal meaning, often to mask true feelings, and it plays a pivotal role in many aspects of communication. Identifying sarcasm is therefore essential for sentiment analysis, social media monitoring, and customer service, because it enables a better understanding of public sentiment. Social media has also become a primary platform where people express their feelings and opinions and give feedback to businesses and service providers, so misinterpreting sarcasm in customer feedback can lead to incorrect responses and actions. Accurately detecting sarcasm is challenging, however, because it depends on context, cultural factors, and inherent ambiguity. Although Machine Learning (ML) research and resources for detecting sarcasm in English are plentiful, including Deep Learning (DL) techniques, research on sarcasm detection in Arabic remains scarce, particularly with respect to DL methodologies and available sarcastic datasets. This paper constructs a new Arabic sarcastic corpus and fine-tunes three pre-trained Arabic transformer-based Language Models (LMs) for Arabic sarcasm detection. It also proposes a hybrid DL approach for sarcasm detection that combines static and contextualized representations from pre-trained models, such as Word2Vec word embeddings and Bidirectional Encoder Representations from Transformers (BERT) models pre-trained on Arabic resources. The proposed enhanced hybrid DL approach outperforms state-of-the-art models by 8% on a shared benchmark dataset and achieves a 5% improvement in F1-score on another.
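The idea of combining static and contextualized representations can be illustrated with a minimal Python sketch. The abstract does not state how the two representations are merged, so the concatenation used here is an assumption, not the paper's method. The function names (`static_sentence_vector`, `fuse`), the toy lookup table, and the placeholder contextual vector are all hypothetical; in practice the static vectors would come from a Word2Vec model and the contextual vector from an Arabic BERT encoder.

```python
from typing import Dict, List, Sequence


def static_sentence_vector(
    tokens: Sequence[str],
    static_table: Dict[str, List[float]],
    dim: int,
) -> List[float]:
    """Average the static (Word2Vec-style) vectors of the tokens found in the table."""
    known = [static_table[t] for t in tokens if t in static_table]
    if not known:
        return [0.0] * dim  # no known token: fall back to a zero vector
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def fuse(static_vec: Sequence[float], contextual_vec: Sequence[float]) -> List[float]:
    """Concatenate both views into one feature vector (assumed fusion strategy)."""
    return list(static_vec) + list(contextual_vec)


# Toy, made-up values standing in for real embeddings (hypothetical data).
toy_static_table = {"great": [1.0, 3.0], "job": [3.0, 1.0]}
toy_contextual_vec = [0.5, -0.5, 0.25]  # placeholder for a BERT sentence embedding

static_vec = static_sentence_vector(["great", "job", "unseen"], toy_static_table, dim=2)
features = fuse(static_vec, toy_contextual_vec)
print(len(features))
```

The fused vector would then be passed to a classifier head that predicts sarcastic versus non-sarcastic. Concatenation is only one option; the paper may use a different combination scheme.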