A transformer-based generative adversarial learning to detect sarcasm from Bengali text with correct classification of confusing text

Sanzana Karim Lora; Ishrat Jahan; Rahad Hussain; Rifat Shahriyar; A.B.M. Alim Al Islam

Heliyon (Dec 2023)

Affiliations

Sanzana Karim Lora: Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka, Bangladesh; Corresponding author.
Ishrat Jahan: Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka, Bangladesh
Rahad Hussain: Department of Bengali Language and Literature, Brac University, Dhaka, Bangladesh
Rifat Shahriyar: Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka, Bangladesh
A.B.M. Alim Al Islam: Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka, Bangladesh

Journal volume & issue: Vol. 9, no. 12, p. e22531

Abstract

Sarcasm detection research in Bengali is still limited by a lack of relevant resources, and obtaining high-quality annotated data is costly and time-consuming. In this paper, we therefore present a transformer-based generative adversarial learning approach for detecting sarcasm in Bengali text from the limited labeled data available. We use the Bengali sarcasm dataset ‘Ben-Sarc’. In addition, we construct a second dataset of Bengali sarcastic and non-sarcastic comments from YouTube and newspapers to observe the model's performance on new data, and we use another Bengali sarcasm dataset, ‘BanglaSarc’, to further demonstrate the robustness of our models. Among all models, the Bangla BERT-based Generative Adversarial Model achieves the highest accuracy: 77.1% on ‘Ben-Sarc’, 68.2% on the dataset constructed from YouTube and newspapers, and 97.2% on ‘BanglaSarc’.

Published in Heliyon

ISSN: 2405-8440 (Online)
Publisher: Elsevier
Country of publisher: United Kingdom
LCC subjects: Science: Science (General); Social Sciences: Social sciences (General)
Website: https://www.cell.com/heliyon/home
