IEEE Access (Jan 2025)
Improving Multi-Label Emotion Classification on Imbalanced Social Media Data With BERT and Clipped Asymmetric Loss
Abstract
This research addresses multi-label emotion classification on imbalanced datasets using a BERT-based model. Emotion classification, essential for applications such as social media analysis and sentiment monitoring, often suffers from class imbalance, which hinders the detection of rare emotions. To address this, our model incorporates a clipped asymmetric loss function that prioritizes minority classes while mitigating the dominance of frequent classes. We conducted extensive experiments on the GoEmotions and SemEval-2018 Task 1C datasets to demonstrate the model’s effectiveness in improving precision, recall, and F1-scores across several taxonomies, including the GoEmotions, Ekman, and sentiment-grouped levels. Our approach notably improved macro-average F1-scores, increasing from 0.46 (baseline) to 0.54 on the GoEmotions dataset and to 0.59 on the SemEval-2018 dataset. The results indicate significant advances over standard BERT implementations and state-of-the-art models, particularly in recognizing rare emotions, making the model a robust solution for real-world multi-label emotion classification under imbalanced settings.
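To make the idea of a clipped asymmetric loss concrete, the following is a minimal sketch of the widely used asymmetric loss with a probability margin for multi-label targets. It is an illustration under stated assumptions, not necessarily the exact formulation used in this work: the function name `asymmetric_loss`, the parameters `gamma_pos`, `gamma_neg`, and `margin`, and their default values are all hypothetical choices made for this example.

```python
import math

def asymmetric_loss(probs, targets, gamma_pos=0.0, gamma_neg=4.0,
                    margin=0.05, eps=1e-8):
    """Sum of per-label asymmetric losses for one example.

    probs   -- predicted probabilities in [0, 1], one per label
    targets -- 0/1 ground-truth indicators, one per label
    All hyperparameter names and defaults are assumptions for illustration.
    """
    total = 0.0
    for p, y in zip(probs, targets):
        if y == 1:
            # Positive label: focal-style weighting with exponent gamma_pos.
            total += -((1.0 - p) ** gamma_pos) * math.log(max(p, eps))
        else:
            # Negative label: shift the probability down by `margin` and
            # clip at zero, so very easy negatives contribute no loss.
            p_m = max(p - margin, 0.0)
            # A larger gamma_neg down-weights the remaining easy negatives.
            total += -(p_m ** gamma_neg) * math.log(max(1.0 - p_m, eps))
    return total
```

With a larger exponent on negatives than on positives, the many easy negatives produced by frequent classes contribute little to the total, which leaves relatively more of the training signal for the rare positive labels.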
Keywords