IEEE Access (Jan 2024)
BertSent: Transformer-Based Model for Sentiment Analysis of Penta-Class Tweet Classification
Abstract
Sentiment analysis (SA) is a popular method for extracting relevant, subjective information from textual content. Sentiment analysis of multimedia material is useful for many reasons, but it is considered challenging because the messages are often brief, unstructured, and linguistically inconsistent. Previous research on sentiment analysis has usually focused on two- or three-class analysis using older language modeling techniques, and penta-class classification has received comparatively little attention. To address this gap, we present BertSent, a transformer-based model that combines ordered preprocessing steps with transformer-based tokenization and optimization to obtain the best sentiment analysis results when data is limited. Our framework targets penta-class classification of tweets, and to that end we combine several preprocessing techniques to fine-tune the transformer-based model. To address class imbalance in the penta-class setup, we employ both over-sampling and under-sampling, which improves model generalization and performance. This article also compares the transformer-based model against a variety of deep learning-based models, including bi-directional models. The experiments and results support our model’s remarkable performance given the limited data and the penta-class classification challenge. Interestingly, under-sampling and over-sampling yield similar results: BertSent combined with over-sampling performs slightly better, with 75.3% test accuracy, compared with 75.1% accuracy for under-sampling.
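As an illustration of the two resampling strategies named in the abstract, the following is a minimal sketch of random over-sampling and random under-sampling on a labeled tweet set. It is not the authors' implementation: the abstract does not specify which resampling method was used, and the function name `resample`, its parameters, and the toy data are hypothetical.

```python
import random
from collections import Counter, defaultdict


def resample(texts, labels, strategy="over", seed=0):
    """Balance classes by random over- or under-sampling (illustrative only).

    Hypothetical helper: the paper's actual resampling procedure is not
    described in the abstract, so plain random resampling is assumed here.
    """
    rng = random.Random(seed)

    # Group the examples by their class label.
    by_class = defaultdict(list)
    for text, label in zip(texts, labels):
        by_class[label].append(text)

    # Over-sampling grows every class to the largest class size;
    # under-sampling shrinks every class to the smallest class size.
    sizes = [len(items) for items in by_class.values()]
    target = max(sizes) if strategy == "over" else min(sizes)

    out_texts, out_labels = [], []
    for label, items in by_class.items():
        if strategy == "over":
            # Keep all originals, then draw the shortfall with replacement.
            picked = items + rng.choices(items, k=target - len(items))
        else:
            # Draw a random subset without replacement.
            picked = rng.sample(items, target)
        out_texts.extend(picked)
        out_labels.extend([label] * target)
    return out_texts, out_labels


# Toy penta-class data with imbalanced class sizes 1, 2, 3, 4, 5.
texts = [f"tweet {i}" for i in range(15)]
labels = [0] * 1 + [1] * 2 + [2] * 3 + [3] * 4 + [4] * 5

_, over_labels = resample(texts, labels, strategy="over")
_, under_labels = resample(texts, labels, strategy="under")
print(Counter(over_labels))   # every class has 5 examples
print(Counter(under_labels))  # every class has 1 example
```

Over-sampling keeps every original tweet at the cost of duplicates, while under-sampling discards data from the larger classes. Resampling is normally applied to the training split only, so that test accuracy is measured on the original class distribution.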
Keywords