CAAI Transactions on Intelligence Technology (Jun 2022)

Bayesian estimation‐based sentiment word embedding model for sentiment analysis

  • Jingyao Tang,
  • Yun Xue,
  • Ziwen Wang,
  • Shaoyang Hu,
  • Tao Gong,
  • Yinong Chen,
  • Haoliang Zhao,
  • Luwei Xiao

DOI
https://doi.org/10.1049/cit2.12037
Journal volume & issue
Vol. 7, no. 2
pp. 144 – 155

Abstract

Sentiment word embedding has been extensively studied and used in sentiment analysis tasks. However, most existing models fail to differentiate between high‐frequency and low‐frequency words. As a result, the sentiment information of low‐frequency words is insufficiently captured, which leads to inaccurate sentiment word embeddings and degrades the overall performance of sentiment analysis. This paper proposes a Bayesian estimation‐based sentiment word embedding (BESWE) model, which aims to precisely extract the sentiment information of low‐frequency words. In the model, a Bayesian estimator is constructed from the co‐occurrence probabilities and sentiment probabilities of words, and a novel loss function is defined for sentiment word embedding learning. Experimental results on sentiment lexicons and the Movie Review dataset show that BESWE outperforms many state‐of‐the‐art methods, for example, C&W, CBOW, GloVe, SE‐HyRank and DLJT1, in sentiment analysis tasks. These results demonstrate that Bayesian estimation can effectively capture the sentiment information of low‐frequency words and integrate it into the word embedding through the loss function. In addition, replacing the embeddings of low‐frequency words in the state‐of‐the‐art methods with those from BESWE can significantly improve the performance of those methods in sentiment analysis tasks.

Keywords