Bayesian estimation‐based sentiment word embedding model for sentiment analysis

Jingyao Tang; Yun Xue; Ziwen Wang; Shaoyang Hu; Tao Gong; Yinong Chen; Haoliang Zhao; Luwei Xiao

doi:10.1049/cit2.12037

CAAI Transactions on Intelligence Technology (Jun 2022)

Affiliations

Jingyao Tang: Guangdong Provincial Key Laboratory of Quantum Engineering and Quantum Materials, School of Physics and Telecommunication Engineering, South China Normal University, Guangzhou, China
Yun Xue: Guangdong Provincial Key Laboratory of Quantum Engineering and Quantum Materials, School of Physics and Telecommunication Engineering, South China Normal University, Guangzhou, China
Ziwen Wang: Guangdong Provincial Key Laboratory of Quantum Engineering and Quantum Materials, School of Physics and Telecommunication Engineering, South China Normal University, Guangzhou, China
Shaoyang Hu: College of Mathematics and Informatics & College of Software Engineering, South China Agricultural University, Guangzhou, China
Tao Gong: School of Foreign Languages, Zhejiang University of Finance & Economics, Hangzhou, Zhejiang, China
Yinong Chen: School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, Tempe, USA
Haoliang Zhao: Guangdong Provincial Key Laboratory of Quantum Engineering and Quantum Materials, School of Physics and Telecommunication Engineering, South China Normal University, Guangzhou, China
Luwei Xiao: Guangdong Provincial Key Laboratory of Quantum Engineering and Quantum Materials, School of Physics and Telecommunication Engineering, South China Normal University, Guangzhou, China

Journal volume & issue: Vol. 7, no. 2, pp. 144–155

Abstract

Sentiment word embedding has been extensively studied and used in sentiment analysis tasks. However, most existing models fail to differentiate between high‐frequency and low‐frequency words. As a result, the sentiment information of low‐frequency words is insufficiently captured, which yields inaccurate sentiment word embeddings and degrades the overall performance of sentiment analysis. A Bayesian estimation‐based sentiment word embedding (BESWE) model is proposed, which aims to precisely extract the sentiment information of low‐frequency words. In the model, a Bayesian estimator is constructed from the co‐occurrence probabilities and sentiment probabilities of words, and a novel loss function is defined for sentiment word embedding learning. Experimental results on sentiment lexicons and the Movie Review dataset show that BESWE outperforms many state‐of‐the‐art methods, for example, C&W, CBOW, GloVe, SE‐HyRank and DLJT1, in sentiment analysis tasks. This demonstrates that Bayesian estimation can effectively capture the sentiment information of low‐frequency words and integrate it into the word embedding through the loss function. In addition, replacing the embeddings of low‐frequency words in the state‐of‐the‐art methods with those from BESWE significantly improves the performance of those methods in sentiment analysis tasks.

Published in CAAI Transactions on Intelligence Technology

ISSN: 2468-2322 (Online)
Publisher: Wiley
Country of publisher: United Kingdom
LCC subjects: Language and Literature: Philology. Linguistics: Computational linguistics. Natural language processing; Science: Mathematics: Instruments and machines: Electronic computers. Computer science: Computer software
Website: https://ietresearch.onlinelibrary.wiley.com/journal/24682322
