Statistical Theory and Related Fields (Jan 2020)

Topic model for graph mining based on hierarchical Dirichlet process

  • Haibin Zhang,
  • Shang Huating,
  • Xianyi Wu

DOI
https://doi.org/10.1080/24754269.2019.1593098
Journal volume & issue
Vol. 4, no. 1
pp. 66 – 77

Abstract

In this paper, a nonparametric Bayesian graph topic model (GTM) based on the hierarchical Dirichlet process (HDP) is proposed. The HDP allows the number of topics to be selected flexibly, which removes the limitation that the number of topics needs to be given in advance. Moreover, the GTM relaxes the ‘bag of words’ assumption and takes the graph structure of the text into account. The combined model, named HDP–GTM, takes advantage of both. A variational inference algorithm is used for posterior inference, and the convergence of the algorithm is analysed. We apply the proposed model to text categorisation and compare it with three related topic models: latent Dirichlet allocation (LDA), GTM and HDP.

Keywords