Published on Wed Jun 13 2012

Hybrid Variational/Gibbs Collapsed Inference in Topic Models

Max Welling, Yee Whye Teh, Hilbert Kappen

Abstract

Variational Bayesian inference and (collapsed) Gibbs sampling are the two most important classes of inference algorithms for Bayesian networks. Both have their advantages and disadvantages: collapsed Gibbs sampling is unbiased but is inefficient for large count values and requires averaging over many samples to reduce variance. On the other hand, variational Bayesian inference is efficient and accurate for large count values but suffers from bias for small counts. We propose a hybrid algorithm that combines the best of both worlds: it samples very small counts and applies variational updates to large counts. This hybridization is shown to significantly improve test-set perplexity relative to variational inference at no additional computational cost.
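
The core idea can be sketched in a few lines. The snippet below is a simplified, CVB0-style rendering of the hybrid scheme rather than the paper's exact derivation: for each (document, word) pair, small counts are resampled with a collapsed Gibbs step, while large counts receive a deterministic variational update of their topic responsibilities. The function name hybrid_update, the threshold of 3, and the blocked treatment of each count are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def hybrid_update(docs, K, V, alpha=0.1, beta=0.01, threshold=3, iters=50, seed=0):
    """docs: list of dicts mapping word id -> count in that document."""
    rng = np.random.default_rng(seed)
    D = len(docs)
    # Expected topic counts; Gibbs steps contribute integer counts,
    # variational updates contribute fractional ones.
    n_dk = np.zeros((D, K))   # document-topic counts
    n_kw = np.zeros((K, V))   # topic-word counts
    n_k = np.zeros(K)         # topic totals
    gamma = {}                # topic responsibilities per (doc, word) pair
    for d, doc in enumerate(docs):
        for w, c in doc.items():
            g = rng.dirichlet(np.ones(K))
            gamma[(d, w)] = g
            n_dk[d] += c * g
            n_kw[:, w] += c * g
            n_k += c * g

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for w, c in doc.items():
                g_old = gamma[(d, w)]
                # Remove this pair's current expected counts.
                n_dk[d] -= c * g_old
                n_kw[:, w] -= c * g_old
                n_k -= c * g_old
                # Collapsed-posterior weights over topics (used by both branches).
                p = (n_dk[d] + alpha) * (n_kw[:, w] + beta) / (n_k + V * beta)
                p /= p.sum()
                if c <= threshold:
                    # Small count: draw hard topic assignments (collapsed Gibbs step).
                    g_new = rng.multinomial(c, p) / c
                else:
                    # Large count: deterministic variational (CVB0-style) update.
                    g_new = p
                gamma[(d, w)] = g_new
                n_dk[d] += c * g_new
                n_kw[:, w] += c * g_new
                n_k += c * g_new
    return n_dk, n_kw
```

For instance, hybrid_update([{0: 5, 1: 1}, {1: 2, 2: 4}], K=2, V=3) would sample topic assignments for the counts of 1 and 2 and update the counts of 4 and 5 variationally; raising the threshold moves the scheme toward pure collapsed Gibbs, lowering it toward pure variational inference.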

Tue Aug 02 2016
Machine Learning
Blocking Collapsed Gibbs Sampler for Latent Dirichlet Allocation Models
The latent Dirichlet allocation (LDA) model is a widely-used latent variable model in machine learning for text analysis. Inference for this model typically involves a single-site collapsed Gibbs sampling step for latent variables. The efficiency of the sampling is critical to the success of the model.
Fri May 10 2013
Machine Learning
Stochastic Collapsed Variational Bayesian Inference for Latent Dirichlet Allocation
The algorithm is simpler and more efficient than the current state-of-the-art method. In experiments on large-scale corpora, it was found to converge faster and often to a better solution than the previous method.
Fri May 08 2015
Machine Learning
Dense Distributions from Sparse Samples: Improved Gibbs Sampling Parameter Estimators for LDA
We introduce a novel approach for estimating Latent Dirichlet Allocation (LDA) parameters from collapsed Gibbs samples (CGS). Our approach can be understood as adapting the soft clustering methodology of Collapsed Variational Bayes (CVB0) to CGS.
Tue Oct 28 2014
Machine Learning
Beta-Negative Binomial Process and Exchangeable Random Partitions for Mixed-Membership Modeling
The beta-negative binomial process (BNBP) is employed to partition a count vector into a latent random count matrix. The marginal probability distribution of the BNBP that governs the exchangeable random partitions of grouped data has not yet been developed. This paper introduces an exchangeable partition probability function for the BNBP.
Thu Jun 11 2015
Machine Learning
Sparse Partially Collapsed MCMC for Parallel Inference in Topic Models
Topic models, and more specifically latent Dirichlet allocation (LDA), are widely used for probabilistic modeling of text. MCMC sampling from the posterior distribution is typically performed using a collapsed Gibbs sampler. We propose a parallel sparse partially collapsed Gibbs sampler.
Wed Sep 05 2012
Machine Learning
Augment-and-Conquer Negative Binomial Processes
We develop data augmentation methods unique to the negative binomial (NB) distribution. We show that the gamma-NB process can be reduced to the hierarchical Dirichlet process with normalization.