Description
Thanks for this amazing package!
I found a small issue concerning performance::check_zeroinflation() with models obtained from glmmTMB::glmmTMB() using negative binomial distributions.
Minimal Working Example
data(Salamanders, package = "glmmTMB")
fit_mass <- MASS::glm.nb(count ~ spp + mined, data = Salamanders)
fit_tmb <- glmmTMB::glmmTMB(count ~ spp + mined, data = Salamanders, family = glmmTMB::nbinom2())
I fitted two models on the same data, specifying a negative binomial distribution, using MASS::glm.nb() and glmmTMB::glmmTMB(), respectively. As you can see, the estimated parameters are identical.
coefficients(fit_mass)
(Intercept) sppPR sppDM sppEC-A sppEC-L sppDES-L sppDF minedno
-1.4605320 -1.2277880 0.4043891 -0.6707205 0.6387474 0.8215330 0.3600422 2.0380754
glmmTMB::fixef(fit_tmb)
Conditional model:
(Intercept) sppPR sppDM sppEC-A sppEC-L sppDES-L sppDF minedno
-1.4605 -1.2278 0.4044 -0.6707 0.6387 0.8215 0.3600 2.0381
Results from performance::check_zeroinflation(), however, are very different:
performance::check_zeroinflation(fit_mass)
# Check for zero-inflation
Observed zeros: 387
Predicted zeros: 374
Ratio: 0.97
Model seems ok, ratio of observed and predicted zeros is within the tolerance range.
performance::check_zeroinflation(fit_tmb)
# Check for zero-inflation
Observed zeros: 387
Predicted zeros: 297
Ratio: 0.77
Model is underfitting zeros (probable zero-inflation).
This is a problem, as the two models are (almost) identical. Digging into the code, I found that the problem is in this part of the code:
performance/R/check_zeroinflation.R
Lines 47 to 52 in 371f1bb
where x is the model fit. Note that x$theta works fine for glm.nb objects but not for glmmTMB objects:
fit_mass$theta
[1] 0.8052867
fit_tmb$theta
NULL
To properly get the dispersion parameter from glmmTMB objects, the function stats::sigma() can be used:
stats::sigma(fit_tmb)
[1] 0.8052869
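A possible way to handle both model classes is sketched below. The helper name get_theta() is hypothetical and not part of the performance package, and the glmmTMB branch assumes the model was fitted with family = nbinom2(), since the meaning of sigma() in glmmTMB depends on the family.

```r
# Hypothetical helper (not part of 'performance'): return the negative
# binomial dispersion parameter for either model class.
get_theta <- function(x) {
  if (inherits(x, "glmmTMB")) {
    # Assumption: family = nbinom2(), where sigma() returns theta.
    # For other glmmTMB families sigma() has a different meaning.
    return(stats::sigma(x))
  }
  # MASS::glm.nb objects store the dispersion directly in the fit.
  # For list-like fits without a theta element this returns NULL.
  x$theta
}
```

With the fits above, get_theta(fit_mass) and get_theta(fit_tmb) should return the two values shown (0.8052867 and 0.8052869).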
Because x$theta is NULL for glmmTMB objects, the function ends up evaluating the density of a Poisson distribution rather than a negative binomial one (see the code linked above).
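To illustrate why this fallback matters, here is a minimal base R sketch. It is not the package's actual code. The fitted means in mu are hypothetical values chosen for illustration, and only theta is taken from the output above. For the same means, a negative binomial distribution puts more probability on zero than a Poisson distribution, so the Poisson fallback predicts fewer zeros.

```r
# Hypothetical fitted means, for illustration only
# (these are NOT the fitted values of the Salamanders models).
mu <- c(0.5, 1, 2, 4)

# Dispersion parameter as reported by stats::sigma(fit_tmb)
theta <- 0.8052869

# Expected number of zeros when theta is missing (Poisson fallback)
pred_zero_pois <- sum(stats::dpois(0, lambda = mu))

# Expected number of zeros under the negative binomial model
pred_zero_nb <- sum(stats::dnbinom(0, size = theta, mu = mu))

# The Poisson fallback predicts fewer zeros than the negative binomial
pred_zero_pois < pred_zero_nb
```

This matches the direction of the discrepancy reported above, where the glmmTMB fit yields fewer predicted zeros (297) than the glm.nb fit (374).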
Note that this issue may also affect other functions that use the dispersion parameter.