Компьютерные исследования и моделирование (Dec 2012)

Regularization, robustness and sparsity of probabilistic topic models

  • Konstantin Vyacheslavovich Vorontsov,
  • Anna Alexandrovna Potapenko

DOI
https://doi.org/10.20537/2076-7633-2012-4-4-693-706
Journal volume & issue
Vol. 4, no. 4
pp. 693 – 706

Abstract

We propose a generalized probabilistic topic model of text corpora that can incorporate the heuristics of Bayesian regularization, sampling, frequent parameter updates, and robustness in any combination. Well-known models such as PLSA, LDA, CVB0, SWB, and many others can be considered special cases of the proposed broad family of models. We propose the robust PLSA model and show that it is sparser and performs better than regularized models such as LDA.

Keywords