Jisuanji kexue (Feb 2022)

Improved Topic Sentiment Model with Word Embedding Based on Gaussian Distribution

  • LI Yu-qiang, ZHANG Wei-jiang, HUANG Yu, LI Lin, LIU Ai-hua

DOI
https://doi.org/10.11896/jsjkx.201200082
Journal volume & issue
Vol. 49, no. 2
pp. 256 – 264

Abstract


In recent years, the topic sentiment model, an important line of research in unsupervised learning, has been used for text topic mining and sentiment analysis. However, Weibo posts pose challenges for topic sentiment models because they are short and structurally incomplete. This paper therefore focuses on improving the topic sentiment model for Weibo. We introduce word embeddings into the popular TSMMF model (topic sentiment model based on multi-feature fusion): a multivariate Gaussian distribution is used to quickly sample neighboring words from the word embedding space, and these replace the words generated by the Dirichlet multinomial distribution. In this way, words with low co-occurrence frequency and little information are transformed into words with a prominent topic and clear information. In addition, a nearest neighbor search algorithm is used to further improve the running speed of the model on large-scale Weibo corpora. The resulting model is called GWE-TSMMF. Experimental results show that the average F1 value of GWE-TSMMF is about 0.718, and that its sentiment polarity analysis outperforms both the original model and the existing mainstream word embedding topic sentiment models (WS-TSWE and HST-SCW).
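
The sampling step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the covariance, the sampler details, or the nearest neighbor algorithm, so the isotropic covariance `sigma**2 * I`, the brute-force Euclidean search, the toy vocabulary, and the function name `sample_neighbor_word` are all assumptions made for illustration.

```python
import numpy as np

# Toy vocabulary and 2-D embeddings (hypothetical values, for illustration only).
vocab = ["happy", "glad", "angry"]
embeddings = np.array([
    [1.0, 0.0],
    [0.9, 0.1],
    [-1.0, 0.0],
])

def sample_neighbor_word(word_idx, embeddings, sigma=0.1, rng=None):
    """Draw a point from a Gaussian centred on a word's embedding and
    return the index of the vocabulary word closest to that point.

    Assumption: isotropic covariance sigma^2 * I. The paper may use a
    different covariance and a faster nearest neighbor search.
    """
    rng = np.random.default_rng() if rng is None else rng
    dim = embeddings.shape[1]
    mean = embeddings[word_idx]
    cov = (sigma ** 2) * np.eye(dim)
    # Sample one point in embedding space around the original word.
    point = rng.multivariate_normal(mean, cov)
    # Brute-force nearest neighbor by Euclidean distance; an approximate
    # or tree-based index would replace this for a large corpus.
    dists = np.linalg.norm(embeddings - point, axis=1)
    return int(np.argmin(dists))

# Usage: replace the word "happy" with a sampled neighbor.
replacement = vocab[sample_neighbor_word(0, embeddings, sigma=0.1)]
```

With a small `sigma` the sampled point stays close to the original embedding, so the word is usually kept or swapped for a near synonym. A larger `sigma` allows more distant replacements.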

Keywords