PLoS ONE (Jan 2017)

Comparison of Criteria for Choosing the Number of Classes in Bayesian Finite Mixture Models.

  • Kazem Nasserinejad,
  • Joost van Rosmalen,
  • Wim de Kort,
  • Emmanuel Lesaffre

DOI
https://doi.org/10.1371/journal.pone.0168838
Journal volume & issue
Vol. 12, no. 1
p. e0168838

Abstract

Identifying the number of classes in Bayesian finite mixture models is a challenging problem. Several criteria have been proposed, such as adaptations of the deviance information criterion, marginal likelihoods, Bayes factors, and reversible jump MCMC techniques. It was recently shown that in overfitted mixture models, the overfitted latent classes will asymptotically become empty under specific conditions for the prior of the class proportions. This result may be used to construct a criterion for finding the true number of latent classes, based on the removal of latent classes that have negligible proportions. Unlike some alternative criteria, this criterion can easily be implemented in complex statistical models such as latent class mixed-effects models and multivariate mixture models using standard Bayesian software. We performed an extensive simulation study to develop practical guidelines to determine the appropriate number of latent classes based on the posterior distribution of the class proportions, and to compare this criterion with alternative criteria. The performance of the proposed criterion is illustrated using a data set of repeatedly measured hemoglobin values of blood donors.
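The idea of removing latent classes with negligible proportions can be illustrated with a minimal sketch. It assumes posterior draws of the class proportions are already available from an overfitted model fitted in standard Bayesian software. The function name `estimate_num_classes`, the cutoff of 0.05, and the toy draws are hypothetical choices made for illustration only; they are not the guidelines derived in the paper's simulation study.

```python
from collections import Counter


def estimate_num_classes(proportion_draws, threshold=0.05):
    """Return the most frequent number of non-negligible classes across draws.

    proportion_draws: list of posterior draws, each a list of K class
        proportions summing to 1 (K is the overfitted number of classes).
    threshold: hypothetical cutoff below which a class is treated as empty.
    """
    # For each posterior draw, count the classes whose proportion exceeds
    # the cutoff, then tally how often each count occurs.
    counts = Counter(
        sum(1 for p in draw if p > threshold) for draw in proportion_draws
    )
    # Take the posterior mode of that count; ties go to the smaller value.
    best = max(counts.values())
    return min(k for k, c in counts.items() if c == best)


# Toy draws (made up for illustration) from a model fitted with K = 4 classes.
draws = [
    [0.58, 0.40, 0.01, 0.01],
    [0.61, 0.37, 0.015, 0.005],
    [0.55, 0.38, 0.06, 0.01],
    [0.60, 0.39, 0.005, 0.005],
]
n_classes = estimate_num_classes(draws)
print(n_classes)
```

Counting within each draw does not depend on how the classes are labeled, so this summary is unaffected by label switching between MCMC iterations. The result is sensitive to the cutoff, which is why the choice of threshold needs guidance of the kind the simulation study aims to provide.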