Natural Language Processing Journal (Sep 2024)
Exploring transformer models in the sentiment analysis task for the under-resourced Bengali language
Abstract
In the sentiment analysis (SA) task, we classify a comment or feedback from an online user or a customer about an object, such as a movie, drama, or food, as positive or negative. Such user sentiment can inform various decision-making processes. Accordingly, many studies have addressed sentiment identification from text in high-resource languages like English. However, only a small number of studies exist for the under-resourced Bengali language, owing to the unavailability of benchmark corpora, the limitations of text-processing tools, and related factors. Furthermore, considerable room remains to improve the classification performance of the SA task. In this research, we experiment on a recognized Bengali dataset of 11,807 comments to identify positive or negative sentiments. We employ five state-of-the-art transformer-based pretrained models, namely multilingual Bidirectional Encoder Representations from Transformers (mBERT), BanglaBERT, Bangla-Bert-Base, DistilmBERT, and XLM-RoBERTa-base (XLM-R-base), with hyperparameter tuning. We then propose a combined model, named Transformer-ensemble, that achieves an accuracy of 95.97% and an F1-score of 95.96%, outperforming recent methods on the Bengali SA task.
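The Transformer-ensemble described above combines the outputs of the five fine-tuned models. The abstract does not specify the combination rule, so the following is a minimal sketch assuming soft voting (averaging per-model class probabilities); the probability values shown are made up for illustration:

```python
import numpy as np

# Hypothetical per-comment probabilities over (negative, positive),
# one row per model; real values would come from the five
# fine-tuned transformers named in the abstract.
model_probs = np.array([
    [0.10, 0.90],  # mBERT
    [0.20, 0.80],  # BanglaBERT
    [0.15, 0.85],  # Bangla-Bert-Base
    [0.30, 0.70],  # DistilmBERT
    [0.05, 0.95],  # XLM-R-base
])

def ensemble_predict(probs: np.ndarray) -> int:
    """Soft-voting ensemble: average the class probabilities
    across models, then return the argmax class index."""
    return int(probs.mean(axis=0).argmax())

label = ensemble_predict(model_probs)
print(["negative", "positive"][label])  # → positive
```

Soft voting is one common ensembling choice; majority (hard) voting over per-model predicted labels is an equally plausible alternative.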