jrpg
is an R package that is used to facilitate the sampling of latent
Pólya-Gamma random variables in Bayesian models with logistic
likelihoods. This data augmentation strategy, originally proposed in
Polson et al.
(2013)
and Windle
(2013)
is very useful for writing custom MCMC algorithms (see also this
post by
Gregory Gunderson and this
post
by Louis Tiao for a blog-style overview).
This package seeks to compile existing implementations of
The Pólya-Gamma packages I have found to be most useful in my work thus far are:
- BayesLogit (TMSA Lab @ UIUC)
- pg (Jesse Windle)
- pgdraw (Enes Makalic and Daniel F Schmidt)
There are no doubt other implementations out there, but these are the
most popular in my experience. pg
uses the same underlying code as
BayesLogit
, but can be much faster. These implement a hybrid rpg()
and rpg_hybrid()
for BayesLogit
and pg
, respectively.
pgdraw
has a particularly fast implementation of the Devroye method,
and so it is optimal for integer
In general, there are 4 reliable sampling techniques for
- Sum-of-gammas approximation: best for very small values of
$b$ . If$X \sim PG(b,z)$ , then the density of$X$ can be expressed as an infinite sum of independent$\text{Gamma}(b,1)$ random variables. This method simply truncates the infinite sum to, say, 200 terms. - Devroye Method: best for integer values of
$b$ . Samples$b$ $PG(1,z)$ random variables using a method outlined in Polson et al. (2013). This method is exact, however for large enough$b$ it often suffices to use one of the following approximations. - Saddlepoint Approximation: best for intermediate values of
$b$ . Samples$X$ from by first creating an envelope over the$PG(b,z)$ density consisting of a mixture of right-truncated inverse-Gaussian and left-truncated gamma distributions. - Normal Approximation: best for large values of
$b$ . Samples$X$ from a normal distribution with mean and variance determined by the first and second moments of the$PG(b,z)$ distribution.
Some of these methods are faster than others, and the implementation in
each packages differs slightly, which also affects speed. The goal of
jrpg
is to compile the methods that work best (read: fastest) in each
scenario, per my personal experience. For parallel sampling, the pgR
package appears to be the only option. jrpg
does not use parallel code
by personal preference, as I prefer to reserve this for running multiple
chains when needed. Perhaps it will be added in the future.
The hybrid sampler for this package uses the following strategy:
jrpg
R package
Case |
jrpg Sampler |
Other Implementations |
---|---|---|
0 < b < 1 | Truncated Sum-of-Gammas |
|
$b 1, 2, , 13$ |
Devroye Method |
|
1 < b < 13 (non-integer) | Truncated Sum-of-Gammas |
|
13 ≤ b < 170 | Saddlepoint Approximation |
|
b ≥ 170 | Normal Approximation |
|
Hybrid Sampler for jrpg
R package
A few notes regarding our choices:
-
Case:
$1 < b < 13$ (non-integer). I have examined the use of a two-stage sampler for this case (Devroye method for integer component + truncated sum-of-gammas for the remainder), as outlined in the supplementary material of Polson et al. (2013). While this might be slightly more accurate, it is slower than just using the sum-of-gammas approach. We have not noticed any issue with this choice in practice. -
Case:
$13 \le b < 170$ . I have found the code which implements the saddlepoint approximation in the original packageBayesLogit
(and hencepg
) to work quite well 99.99% of the time. However, on rare occasions the approximation fails due to convergence issues. We have added a fail-safe that will default to the two-stage sampler described in the above point when this occurs.
The approach is similar to that which is used in the
PolyaGammaHybridSamplers.jl
package. The function corresponding to the hybrid sampler is this
package is jrpg()
.
For some examples on how to use jrpg
to facilitate Gibbs sampling in
custom MCMC routines, view the package vignette by running the following
after installation:
vignette(topic = 'mcmc-with-jrpg', package = 'jrpg')
You can install jrpg
and (optionally) build its vignettes like so:
install.packages('devtools')
devtools::install_github('jacobenglert/jrpg', build_vignettes = TRUE)
Note that installation will require compilation of Rcpp
and
RcppArmadillo
code. If you have used packages with Rcpp
dependence,
you will probably not have issues. If not, you will need to install a
working C++ compiler. This will be either
Xcode (Mac) or
Rtools (Windows).
-
Balamuta, J. J. (2021). Bayesian estimation of restricted latent class models: Extending priors, link functions, and structural models. University of Illinois Urbana-Champaign. Available at: https://www.ideals.illinois.edu/items/121209.
-
Enes Makalic and Daniel Schmidt (2016). High-Dimensional Bayesian Regularised Regression with the BayesReg Package. arXiv. Available at: https://arxiv.org/abs/1611.06649.
-
Polson, N.G., Scott, J.G. and Windle, J. (2013) ‘Bayesian Inference for Logistic Models Using Pólya–Gamma Latent Variables’, Journal of the American Statistical Association, 108(504), pp. 1339–1349. Available at: https://doi.org/10.1080/01621459.2013.829001.
-
Windle, J. (2013) Forecasting High-Dimensional, Time-Varying Variance-Covariance Matrices with High-Frequency Data and Sampling Pólya-Gamma Random Variates for Posterior Distributions Derived from Logistic Likelihoods. The University of Texas at Austin.
-
Windle, J., Polson, N.G. and Scott, J.G. (2014) ‘Sampling Polya-Gamma random variates: alternate and approximate techniques’. arXiv. Available at: http://arxiv.org/abs/1405.0506.