Sinan Yildrim

Facilitating the Penalty Method for MCMC With Large Data

In this paper, we propose a computationally efficient and accurate subsampling strategy to be used within the Metropolis-Hastings (MH) algorithm in the context of Bayesian estimation of a parameter given a large dataset. Our method relies on the penalty method and consists of approximating the acceptance ratio of the accept/reject mechanism of the MH algorithm using a random subsample of the data and correcting in order to reduce bias. A naive implementation of this method leads to poor performance due to the variability introduced by subsampling. Our main contribution is to introduce a powerful variance reduction technique in this context relying on a Taylor approximation of the acceptance ratio. The cost of our algorithm to produce T samples is O(n + mT), where n is the data size and m is the subsample size, as opposed to O(nT) for the exact MH algorithm. We carry out a theoretical analysis and a numerical study which both suggest that m can be taken to grow sub-linearly (and in some cases may even be taken to be constant) in order to ensure uniform constant accuracy, resulting in a cost per iteration of the algorithm which is sub-linear. Our simulations show that T is not required to grow in order to maintain a given degree of accuracy.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Sinan Yildrim

Facilitating the Penalty Method for MCMC With Large Data

Clone this wiki locally