IEEE Access (Jan 2020)

Fine-Tuning BERT for Multi-Label Sentiment Analysis in Unbalanced Code-Switching Text

  • Tiancheng Tang,
  • Xinhuai Tang,
  • Tianyi Yuan

DOI
https://doi.org/10.1109/access.2020.3030468
Journal volume & issue
Vol. 8
pp. 193248 – 193256

Abstract

Previous research on sentiment analysis has mainly focused on binary or ternary sentiment analysis of monolingual texts. However, on today's social media platforms such as micro-blogs, emotions are often expressed in bilingual or multilingual text, known as code-switching text, and people's emotions are complex, including happiness, sadness, anger, fear, surprise, etc. Several emotions may appear together, and the proportion of each emotion in code-switching text is often unbalanced. Inspired by the recently proposed BERT model, in this paper we investigate how to fine-tune BERT for multi-label sentiment analysis of code-switching text. Our investigation covers the selection of pre-trained models and the fine-tuning methods of BERT for this task. To deal with the unbalanced distribution of emotions, we propose a method based on data augmentation, undersampling, and ensemble learning that produces balanced samples and trains different multi-label BERT classifiers. Our model combines the predictions of the individual classifiers to obtain the final outputs. Experiments on the NLPCC 2018 Shared Task 1 dataset show the effectiveness of our model on unbalanced code-switching text. The F1-score of our model is higher than that of many previous models.
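To make the approach described in the abstract concrete, the sketch below (not the authors' code) shows one common way to realize a multi-label BERT classifier with a sigmoid/binary cross-entropy head, and an ensemble that averages the per-label probabilities of several such classifiers, each assumed to be trained on a differently re-balanced (augmented/undersampled) split. The checkpoint name, the five-emotion label set, and the probability-averaging rule are illustrative assumptions, not details from the paper.

```python
# Hedged sketch of a multi-label BERT classifier plus a simple ensemble.
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer

LABELS = ["happiness", "sadness", "anger", "fear", "surprise"]  # assumed label set

class MultiLabelBert(nn.Module):
    def __init__(self, model_name="bert-base-multilingual-cased", num_labels=len(LABELS)):
        super().__init__()
        # A multilingual checkpoint is a natural choice for code-switching text (assumption).
        self.bert = BertModel.from_pretrained(model_name)
        self.dropout = nn.Dropout(0.1)
        self.classifier = nn.Linear(self.bert.config.hidden_size, num_labels)

    def forward(self, input_ids, attention_mask, labels=None):
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        logits = self.classifier(self.dropout(out.pooler_output))  # one logit per emotion
        if labels is not None:
            # Multi-label setup: an independent sigmoid per label with binary cross-entropy loss.
            loss = nn.BCEWithLogitsLoss()(logits, labels.float())
            return loss, logits
        return logits

@torch.no_grad()
def ensemble_predict(models, tokenizer, texts, threshold=0.5):
    """Average sigmoid probabilities over classifiers trained on balanced splits."""
    enc = tokenizer(texts, padding=True, truncation=True, max_length=128, return_tensors="pt")
    probs = torch.stack(
        [torch.sigmoid(m(enc["input_ids"], enc["attention_mask"])) for m in models]
    ).mean(dim=0)
    # A label is emitted when its averaged probability passes the threshold.
    return (probs >= threshold).int()
```

In this reading, the re-balancing happens upstream (each ensemble member sees a balanced training split built by augmentation and undersampling), and the ensemble step only combines probabilities; the paper's exact combination rule may differ.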

Keywords