IEEE Access (Jan 2019)

ULW-DMM: An Effective Topic Modeling Method for Microblog Short Text

  • Jia Yu
  • Lirong Qiu

DOI
https://doi.org/10.1109/ACCESS.2018.2885987
Journal volume & issue
Vol. 7
pp. 884 – 893

Abstract

With the popularity of social media, including microblogs, mining useful information from short texts has become an increasingly important problem. However, the sparseness, high dimensionality, and sheer volume of such data make this a very challenging task. In this paper, we propose ULW-DMM, a method that extends the Dirichlet multinomial mixture (DMM) topic model by combining the user-LDA topic model, which is based on internal data expansion, with latent feature vector representations of words trained on a very large external corpus. Experimental results show that the ULW-DMM model yields a relatively large improvement in topic coherence and in classification tasks when modeling topics in microblog short texts.

Keywords