IEEE Access (Jan 2019)

Microblog Hot Topics Detection Based on VSM and HMBTM Model Fusion

  • Qiu Liqing,
  • Jia Wei,
  • Liu Haiyan,
  • Fan Xin

DOI
https://doi.org/10.1109/ACCESS.2019.2932458
Journal volume & issue
Vol. 7
pp. 120273 – 120281

Abstract

With the rapid development of social media such as Twitter and Sina microblog, short texts are becoming increasingly common. However, because word co-occurrence patterns in short texts are sparse, inferring topics from them remains a challenge. To address this sparsity problem, we present HVCH (Hot topic detection based on the VSM Combined HMBTM), a fusion-model algorithm. Taking heat as the key factor, we introduce a heat matrix computed over words and word pairs to improve the BTM (Biterm Topic Model), and we mine the semantic relationships between words through their common heat; the improved model is then applied to hot topic detection in short texts. Subsequently, we fuse the VSM (Vector Space Model) with the HMBTM (Heat Matrix based BTM) and use an optimized Single-Pass clustering algorithm to obtain hot topics. Finally, we conduct extensive experiments on real data sets, which demonstrate that our proposal performs well compared with other related algorithms.
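
The pipeline described in the abstract (a similarity that fuses a VSM view with a topic-model view, fed into Single-Pass clustering) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives neither the fusion formula nor the Single-Pass optimization, so the linear weighting `alpha`, the use of cosine similarity for both views, the `threshold` value, and the comparison against each cluster's first member are all assumptions made for illustration. Each document is assumed to be already represented as a pair of vectors (a VSM term vector and a topic distribution); the HMBTM inference itself is not shown.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length numeric vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def fused_similarity(doc_a, doc_b, alpha=0.5):
    """Hypothetical fusion: weighted sum of VSM similarity and topic similarity.

    Each doc is a (vsm_vector, topic_vector) pair. The weight alpha is an
    assumption; the abstract does not state how the two models are combined.
    """
    vsm_a, topic_a = doc_a
    vsm_b, topic_b = doc_b
    return alpha * cosine(vsm_a, vsm_b) + (1 - alpha) * cosine(topic_a, topic_b)


def single_pass(docs, threshold=0.6, alpha=0.5):
    """Plain Single-Pass clustering; returns clusters as lists of document indices.

    Documents are processed once, in order. Each document joins the most
    similar existing cluster if the similarity reaches the threshold;
    otherwise it starts a new cluster. Comparing against the cluster's first
    member is a simplification, not the paper's optimized variant.
    """
    clusters = []
    for i, doc in enumerate(docs):
        best_cluster, best_sim = None, 0.0
        for cluster in clusters:
            sim = fused_similarity(doc, docs[cluster[0]], alpha)
            if sim > best_sim:
                best_cluster, best_sim = cluster, sim
        if best_cluster is not None and best_sim >= threshold:
            best_cluster.append(i)
        else:
            clusters.append([i])
    return clusters
```

Under these assumptions, clusters could then be ranked by size (or by an aggregated heat score, if one is available) to pick candidate hot topics; that ranking step is likewise a guess at how the output might be used rather than something the abstract specifies.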

Keywords