Skip to content

Sinan Yildrim

Yee Whye Teh edited this page Jun 14, 2015 · 1 revision

Facilitating the Penalty Method for MCMC With Large Data

In this paper, we propose a computationally efficient and accurate subsampling strategy to be used within the Metropolis-Hastings (MH) algorithm in the context of Bayesian estimation of a parameter given a large dataset. Our method relies on the penalty method and consists of approximating the acceptance ratio of the accept/reject mechanism of the MH algorithm using a random subsample of the data and correcting in order to reduce bias. A naive implementation of this method leads to poor performance due to the variability introduced by subsampling. Our main contribution is to introduce a powerful variance reduction technique in this context relying on a Taylor approximation of the acceptance ratio. The cost of our algorithm to produce T samples is O(n + mT), where n is the data size and m is the subsample size, as opposed to O(nT) for the exact MH algorithm. We carry out a theoretical analysis and a numerical study which both suggest that m can be taken to grow sub-linearly (and in some cases may even be taken to be constant) in order to ensure uniform constant accuracy, resulting in a cost per iteration of the algorithm which is sub-linear. Our simulations show that T is not required to grow in order to maintain a given degree of accuracy.

Clone this wiki locally