Improving sentiment analysis accuracy with emoji embedding

Chuchu Liu; Fan Fang; Xu Lin; Tie Cai; Xu Tan; Jianguo Liu; Xin Lu

Journal of Safety Science and Resilience (Dec 2021)

Affiliations

Chuchu Liu: College of Systems Engineering, National University of Defense Technology, Changsha, 410073, China
Fan Fang: College of Systems Engineering, National University of Defense Technology, Changsha, 410073, China
Xu Lin: College of Computer, National University of Defense Technology, Changsha, 410073, China
Tie Cai: School of Software Engineering, Shenzhen Institute of Information Technology, Shenzhen, 518172, China
Xu Tan: School of Software Engineering, Shenzhen Institute of Information Technology, Shenzhen, 518172, China
Jianguo Liu: Institute of Accounting and Finance, Shanghai University of Finance and Economics, Shanghai, 200433, China
Xin Lu: College of Systems Engineering, National University of Defense Technology, Changsha, 410073, China; Corresponding author

Journal volume & issue: Vol. 2, no. 4
pp. 246 – 252

Abstract

Due to the diversity and variability of Chinese syntax and semantics, accurately identifying and distinguishing individual emotions from online texts is challenging. To address this challenge, we incorporate a new source of individual sentiment: emojis, a set of thousands of graphic symbols that are increasingly used to express emotion in online conversations. We examined popular sentiment analysis algorithms, including rule-based and classification algorithms, to evaluate how adding emojis as supplementary features affects algorithm performance. Emojis were also translated into corresponding sentiment words when constructing features, for comparison with features generated directly from emoji label words. In addition, considering the different functions of emojis in texts, we classified all posts in the dataset by their emoji usage and examined the changes in algorithm performance. We found that emojis are effective as additional features for improving the accuracy of sentiment analysis algorithms, and that algorithm performance can be further increased by taking different emoji usages into consideration. In this study, we developed an improved emoji-embedding model based on Bi-LSTM (namely, CEmo-LSTM), which achieves the highest accuracy (around 0.95) when analyzing online Chinese texts. We applied the CEmo-LSTM algorithm to a large dataset collected from Weibo from December 1, 2019 to March 20, 2020 to understand the sentiment evolution of online users during the COVID-19 pandemic. We found that the pandemic remarkably impacted individual sentiments and caused more passive emotions (e.g., horror and sadness). Our novel emoji-embedding algorithm combines emojis and emoji usage with the sentiment analysis model and can handle emotion mining tasks more effectively and efficiently.
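
The two preprocessing ideas mentioned in the abstract, translating emojis into sentiment words and grouping posts by emoji usage, can be illustrated with a minimal sketch. This is not the authors' implementation: the lexicon entries, the three usage categories, and the function names `translate_emojis` and `emoji_usage` are all assumptions made for illustration, and the CEmo-LSTM model itself is not reproduced here.

```python
import re

# Hypothetical emoji-to-sentiment-word lexicon. The entries are illustrative
# assumptions and are not taken from the paper.
EMOJI_TO_WORD = {
    "😀": "happy",
    "😢": "sad",
    "😱": "scared",
}

# Matches any emoji that appears in the lexicon above.
EMOJI_PATTERN = re.compile("|".join(re.escape(e) for e in EMOJI_TO_WORD))


def translate_emojis(text: str) -> str:
    """Replace each known emoji with its sentiment word and normalize spaces."""
    replaced = EMOJI_PATTERN.sub(lambda m: f" {EMOJI_TO_WORD[m.group(0)]} ", text)
    return " ".join(replaced.split())


def emoji_usage(text: str) -> str:
    """Assign a post to a coarse emoji-usage category (hypothetical scheme)."""
    emojis = EMOJI_PATTERN.findall(text)
    remaining = EMOJI_PATTERN.sub("", text).strip()
    if not emojis:
        return "text_only"
    if not remaining:
        return "emoji_only"
    return "text_with_emoji"
```

In a pipeline of this kind, the translated text would be tokenized and passed to a classifier, while the usage category could be used to split the dataset or serve as an extra feature. Which of these the paper does is not stated in the abstract.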

Published in Journal of Safety Science and Resilience

ISSN: 2666-4496 (Online)
Publisher: KeAi Communications Co., Ltd.
Country of publisher: China
LCC subjects: Social Sciences: Industries. Land use. Labor: Management. Industrial management: Risk in industry. Risk management
Website: https://www.keaipublishing.com/en/journals/journal-of-safety-science-and-resilience/
