Dianzi Jishu Yingyong (Apr 2020)

K-means Weibo topic discovery model based on feature fusion

  • Li Hailei,
  • Yang Wenzhong,
  • Li Donghao,
  • Wen Jiebin,
  • Qian Yunyun

DOI
https://doi.org/10.16157/j.issn.0258-7998.191367
Journal volume & issue
Vol. 46, no. 4
pp. 24 – 28

Abstract

To address the high-dimensional sparsity that Weibo short texts pose for traditional topic detection methods, a K-means Weibo topic discovery model based on feature fusion is proposed. To better express the semantic information of Weibo topics, a vector model built on word pairs that co-occur within a sentence (Biterm_VSM) is used in place of the traditional vector space model (VSM). It is combined with the latent Dirichlet allocation (LDA) topic model, which mines the latent semantics of Weibo short texts. The features obtained from the two models are fused, and the K-means clustering algorithm is applied to the fused features to discover topics. Experimental results show that the model achieves an adjusted Rand index (ARI) of 0.80, which is 3% to 6% higher than traditional topic detection methods.
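As a rough illustration of two ingredients named in the abstract, the sketch below builds count vectors over word pairs that co-occur in a sentence and computes the adjusted Rand index from a contingency table. It is a minimal standard-library sketch, not the authors' implementation: the function names (`biterms`, `biterm_vectors`, `adjusted_rand_index`) are hypothetical, the choice of unordered pairs of distinct words with raw counts as weights is an assumption, and the LDA features, the fusion step, and K-means itself are omitted.

```python
from collections import Counter
from itertools import combinations
from math import comb


def biterms(tokens):
    """Return the unordered pairs of distinct words in one tokenized sentence.

    Assumption: a biterm is any pair of different words in the same sentence.
    """
    unique = sorted(set(tokens))
    return list(combinations(unique, 2))


def biterm_vectors(sentences):
    """Map each tokenized sentence to a count vector over the biterm vocabulary."""
    per_doc = [Counter(biterms(s)) for s in sentences]
    vocab = sorted({b for counts in per_doc for b in counts})
    vectors = [[counts.get(b, 0) for b in vocab] for counts in per_doc]
    return vocab, vectors


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index of two labelings, via their contingency table."""
    n = len(labels_true)
    if n < 2:
        return 1.0  # no pairs to compare; treat as trivial agreement
    cells = Counter(zip(labels_true, labels_pred))
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_cols = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings put everything in one cluster
    return (sum_cells - expected) / (max_index - expected)
```

In a fuller pipeline, each biterm vector would be concatenated with the document's LDA topic distribution before clustering, and `adjusted_rand_index` would compare the resulting K-means cluster labels against the annotated topic labels.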

Keywords