Jisuanji kexue (Sep 2021)

Public Opinion Sentiment Big Data Analysis Ensemble Method Based on Spark

  • DAI Hong-liang, ZHONG Guo-jin, YOU Zhi-ming, DAI Hong-ming

DOI
https://doi.org/10.11896/jsjkx.210400280
Journal volume & issue
Vol. 48, no. 9
pp. 118 – 124

Abstract

With the development of mobile Internet technology, social media has become the main channel through which the public shares views and expresses emotions. Sentiment analysis of social media texts during major social events can effectively monitor public opinion. To address the low accuracy and efficiency of existing Chinese social media sentiment analysis algorithms, an ensemble sentiment analysis big data method (S-FWS) based on the Spark distributed system is proposed. First, the text is pre-segmented with the Jieba library, and new words are discovered by calculating the PMI (pointwise mutual information) association degree. Then, text features are extracted by taking word importance into account, and feature selection is performed with Lasso. Finally, because the traditional Stacking framework neglects feature importance, the accuracy of the primary learners is used to weight their probabilistic features, and polynomial features are constructed to train the secondary learner. A variety of algorithms are run in stand-alone mode and on the Spark platform, respectively, for comparative experiments. The results show that the proposed S-FWS method has advantages in accuracy and time consumption; the distributed system greatly improves the operating efficiency of the algorithms, and the time consumption of the algorithms gradually decreases as the number of working nodes increases.
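The new-word discovery step can be illustrated with a minimal sketch. It assumes the text has already been segmented into token lists (for example by Jieba) and scores adjacent token pairs with the standard definition PMI(a, b) = log(p(a, b) / (p(a) p(b))). The function names `pmi_scores` and `candidate_new_words` and the thresholds `min_pmi` and `min_count` are hypothetical choices for illustration; the abstract does not give the authors' exact procedure or parameters.

```python
import math
from collections import Counter


def pmi_scores(sentences):
    """Map each adjacent token pair to (PMI, pair count).

    `sentences` is a list of token lists, assumed to be pre-segmented.
    """
    unigrams = Counter()
    bigrams = Counter()
    for tokens in sentences:
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))

    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    scores = {}
    for (a, b), count in bigrams.items():
        p_ab = count / n_bi          # probability of the adjacent pair
        p_a = unigrams[a] / n_uni    # probability of the left token
        p_b = unigrams[b] / n_uni    # probability of the right token
        scores[(a, b)] = (math.log(p_ab / (p_a * p_b)), count)
    return scores


def candidate_new_words(sentences, min_pmi=1.0, min_count=2):
    """Join pairs that are both frequent and strongly associated.

    The thresholds are illustrative assumptions, not values from the paper.
    """
    scores = pmi_scores(sentences)
    return sorted(
        a + b
        for (a, b), (pmi, count) in scores.items()
        if pmi >= min_pmi and count >= min_count
    )
```

Calling `candidate_new_words` on a segmented corpus returns the merged strings of the pairs that pass both thresholds; in practice these candidates would be added to the segmenter's dictionary before the text is segmented again.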

Keywords