IEEE Access (Jan 2021)

A Novel Emotion Lexicon for Chinese Emotional Expression Analysis on Weibo: Using Grounded Theory and Semi-Automatic Methods

  • Liang Xu,
  • Linjian Li,
  • Zehua Jiang,
  • Zaoyi Sun,
  • Xin Wen,
  • Jiaming Shi,
  • Rui Sun,
  • Xiuying Qian

DOI
https://doi.org/10.1109/ACCESS.2020.3009292
Journal volume & issue
Vol. 9
pp. 92757 – 92768

Abstract

As one of the most popular social media platforms in China, Weibo has accumulated a huge volume of text containing people’s thoughts, feelings, and experiences, and analyzing the emotions expressed there has attracted a great deal of academic attention. An emotion lexicon is a vital foundation of sentiment analysis, but existing lexicons still have defects: a limited variety of emotions, poor cross-scenario adaptability, and confusion between written and online expressions and words. By combining grounded theory and semi-automatic methods, we built a Weibo-based emotion lexicon for sentiment analysis. First, we took a bottom-up approach to derive a theoretical model of the emotions expressed on Weibo; the substantive coding led to eight core emotion categories: joy, expectation, love, anger, anxiety, disgust, sadness, and surprise. Second, we built a new emotion lexicon containing 2,964 words by manually selecting seed words, constructing a word vector model to expand the word set, and defining rules to filter words. Finally, we tested the effectiveness of our lexicon by using a lexicon-based approach to recognize the emotions expressed in Weibo text. The results showed that our lexicon performed better in Weibo emotion recognition than five other Chinese emotion lexicons. This study proposes a method for constructing an emotion lexicon that considers both theory and application by combining qualitative research and artificial intelligence methods. Our work also provides a reference for future research on social media sentiment analysis.
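
The seed-expansion and lexicon-based recognition steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The toy three-dimensional vectors, the example words, the cosine threshold of 0.95, and the helper names `expand_seeds` and `recognize` are all assumptions made for illustration. The rule-based filtering step is omitted, and the input is assumed to be already segmented into tokens.

```python
import math
from collections import Counter

# Hypothetical seed words mapped to two of the eight categories.
# These entries are illustrative, not taken from the published lexicon.
SEEDS = {"开心": "joy", "愤怒": "anger"}

# Hypothetical toy word vectors; a real pipeline would train these on a corpus.
TOY_VECTORS = {
    "开心": (0.90, 0.10, 0.00),  # "happy" (seed)
    "高兴": (0.85, 0.15, 0.05),  # "glad"
    "愤怒": (0.05, 0.10, 0.90),  # "angry" (seed)
    "生气": (0.00, 0.10, 0.95),  # "mad"
    "桌子": (0.10, 0.90, 0.10),  # "table" (non-emotional)
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_seeds(seeds, vectors, threshold=0.95):
    """Add each non-seed word to the lexicon under the emotion of its most
    similar seed, provided the similarity reaches the (assumed) threshold."""
    lexicon = dict(seeds)
    for word, vec in vectors.items():
        if word in seeds:
            continue
        best_emotion, best_sim = None, threshold
        for seed, emotion in seeds.items():
            if seed not in vectors:
                continue
            sim = cosine(vec, vectors[seed])
            if sim >= best_sim:
                best_emotion, best_sim = emotion, sim
        if best_emotion is not None:
            lexicon[word] = best_emotion
    return lexicon


def recognize(tokens, lexicon):
    """Return the emotion with the most lexicon hits, or None if no token matches."""
    counts = Counter(lexicon[t] for t in tokens if t in lexicon)
    return counts.most_common(1)[0][0] if counts else None


lexicon = expand_seeds(SEEDS, TOY_VECTORS)
label = recognize(["今天", "很", "高兴"], lexicon)
```

Counting lexicon hits per category is only one simple way to do lexicon-based recognition. The abstract does not specify how matches are scored or how ties are resolved, so those choices here are placeholders.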

Keywords