Published on Fri May 06 2016

Likelihood Inflating Sampling Algorithm

Reihaneh Entezari, Radu V. Craiu, Jeffrey S. Rosenthal

Markov Chain Monte Carlo (MCMC) sampling from a massive data set can be computationally prohibitive. We introduce a new communication-free parallel method, the Likelihood Inflating Sampling Algorithm (LISA).

Abstract

Markov Chain Monte Carlo (MCMC) sampling from a posterior distribution corresponding to a massive data set can be computationally prohibitive since producing one sample requires a number of operations that is linear in the data size. In this paper, we introduce a new communication-free parallel method, the Likelihood Inflating Sampling Algorithm (LISA), that significantly reduces computational costs by randomly splitting the dataset into smaller subsets and running MCMC methods independently in parallel on each subset using different processors. Each processor runs an MCMC chain that samples from a sub-posterior distribution defined using an "inflated" likelihood function. We develop a strategy for combining the draws from the different sub-posteriors to study the full posterior of the Bayesian Additive Regression Trees (BART) model. The performance of the method is tested using both simulated and real data.
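To make the scheme concrete, here is a minimal sketch for a toy Gaussian model. It is an illustration under stated assumptions, not the paper's implementation: each sub-posterior is taken to be proportional to the full prior times the subset likelihood raised to the power K (the "inflated" likelihood described in the abstract), each chain is a plain random-walk Metropolis sampler with an ad hoc step size, and the final step simply pools the sub-posterior draws, a placeholder for the BART-specific combination strategy developed in the paper.

```python
import numpy as np

# Toy model: y_i ~ N(theta, 1) with prior theta ~ N(0, 10^2).
# The dataset is split into K subsets; each "worker" runs an independent
# random-walk Metropolis chain targeting the sub-posterior
#   p(theta) * L_k(theta)^K,
# i.e. the subset likelihood inflated by the number of subsets.
# (In practice each chain would run on its own processor, with no
# communication between chains.)

rng = np.random.default_rng(0)

def log_subposterior(theta, subset, K, prior_sd=10.0):
    # log prior + K * (subset log likelihood): the "inflated" likelihood
    log_prior = -0.5 * (theta / prior_sd) ** 2
    log_lik = -0.5 * np.sum((subset - theta) ** 2)
    return log_prior + K * log_lik

def rw_metropolis(subset, K, n_iter=5000, step=0.02):
    theta = 0.0
    draws = np.empty(n_iter)
    lp = log_subposterior(theta, subset, K)
    for i in range(n_iter):
        prop = theta + step * rng.normal()
        lp_prop = log_subposterior(prop, subset, K)
        if np.log(rng.uniform()) < lp_prop - lp:
            theta, lp = prop, lp_prop
        draws[i] = theta
    return draws

# Simulated data and a random split into K subsets.
y = rng.normal(2.0, 1.0, size=10_000)
K = 10
subsets = np.array_split(rng.permutation(y), K)

sub_draws = [rw_metropolis(s, K)[1000:] for s in subsets]  # drop burn-in

# Placeholder combination step: pool all sub-posterior draws.
pooled = np.concatenate(sub_draws)
print(f"pooled posterior mean ~ {pooled.mean():.3f}, sd ~ {pooled.std():.3f}")
```

Because the inflated likelihood makes each sub-posterior an approximation to the full posterior rather than a flattened fragment of it, simple pooling is a reasonable baseline here; the paper's contribution is a more careful combination strategy tailored to BART.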

Mon Jan 28 2019
Machine Learning
Scalable Metropolis-Hastings for Exact Bayesian Inference with Large Datasets
Bayesian inference via standard Markov Chain Monte Carlo (MCMC) methods is too computationally intensive to handle large datasets. We propose the Scalable Metropolis-Hastings (SMH) kernel that exploits Gaussian concentration of the posterior.
Tue May 08 2018
Machine Learning
Subsampling Sequential Monte Carlo for Static Bayesian Models
We show how to speed up Sequential Monte Carlo (SMC) for Bayesian inference by data subsampling. SMC sequentially updates a cloud of particles through a sequence of distributions. Each update of the particle cloud consists of three steps: reweighting, resampling, …
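For readers unfamiliar with these steps, the following is a generic, textbook-style sketch of a tempered SMC sampler for a toy Gaussian model; it illustrates the reweight/resample/move cycle and is not the data-subsampling method proposed in that paper.

```python
import numpy as np

# Move particles from the prior N(0, 5^2) to the posterior of
# y_i ~ N(theta, 1) through tempered targets with powers
# 0 = t_0 < t_1 < ... < t_T = 1.

rng = np.random.default_rng(1)
y = rng.normal(1.5, 1.0, size=200)

def log_lik(theta):
    # vectorised log likelihood for a vector of particles
    return -0.5 * ((y[None, :] - theta[:, None]) ** 2).sum(axis=1)

n = 1000
particles = rng.normal(0.0, 5.0, size=n)
temps = np.linspace(0.0, 1.0, 21)

for t_prev, t_next in zip(temps[:-1], temps[1:]):
    # 1) reweighting: incremental weights for the next tempered target
    logw = (t_next - t_prev) * log_lik(particles)
    w = np.exp(logw - logw.max())
    w /= w.sum()
    # 2) resampling: multinomial resampling according to the weights
    particles = particles[rng.choice(n, size=n, p=w)]
    # 3) moving: one random-walk Metropolis step per particle,
    #    targeting the current tempered posterior
    def log_target(th):
        return -0.5 * (th / 5.0) ** 2 + t_next * log_lik(th)
    prop = particles + 0.2 * rng.normal(size=n)
    accept = np.log(rng.uniform(size=n)) < log_target(prop) - log_target(particles)
    particles = np.where(accept, prop, particles)

print(f"posterior mean ~ {particles.mean():.3f}")
```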
Tue Nov 19 2013
Machine Learning
Asymptotically Exact, Embarrassingly Parallel MCMC
A parallel Markov chain Monte Carlo (MCMC) algorithm in which subsets of data are processed independently, with very little communication.
Fri May 27 2016
Machine Learning
Merging MCMC Subposteriors through Gaussian-Process Approximations
Markov chain Monte Carlo (MCMC) algorithms have become powerful tools for Bayesian inference. However, they do not scale well to large-data problems. By constructing a Gaussian-process approximation for each log-subposterior density, we obtain a tractable approximation.
Tue Dec 17 2013
Machine Learning
Parallelizing MCMC via Weierstrass Sampler
A new Weierstrass sampler for parallel MCMC based on independent subsets. The new sampler approximates samples from the full-data posterior by combining the posterior draws from independent MCMC chains.
Wed Jun 10 2015
Machine Learning
Parallelizing MCMC with Random Partition Trees
The modern scale of data has brought new challenges to Bayesian inference: conventional MCMC algorithms are computationally very expensive for large data sets. A promising approach to this problem is embarrassingly parallel MCMC (EP-MCMC). The new algorithm applies random partition trees to combine the subset posterior draws.