-
-
Notifications
You must be signed in to change notification settings - Fork 94
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DHARMa implementation for new check_residuals()
function
#643
Conversation
@mccarthy-m-g I invited you to get write access, which should allow you to push to this PR, so we can collaboratively work on this. |
This comment was marked as outdated.
This comment was marked as outdated.
Thanks! Accepted. It will probably be a month or two before I can dedicate time to this, but I'll look at the issues you linked above today. |
I don't think you need to look into all those issues I linked to, that was more a reminder for myself ;-) We should focus on #595, and then I'll check to which extent other issues might be resolved. |
Sounds good, although I already started looking... 😅 (DHARMa doesn't support objects of class "lrm", so you can drop #471 from consideration) I think the |
Agreed. |
So the basic workflow would be:
Then we wouldn't need particular |
Yes and no. That's the basic workflow, but we'll want a print method for I'll add some rough commits with comments for context. |
If you look at the changes from those commits, particularly the vignette I added, does my rationale for why we don't want to prematurely return residuals in |
I see, yes, makes sense! I'm note sure how to align the library(performance)
library(glmmTMB)
model <- glmmTMB(
count ~ mined + spp + (1 | site),
family = poisson,
data = Salamanders
)
res <- simulate_residuals(model)
x <- check_normality(res)
plot(x) plot(DHARMa::simulateResiduals(model))
#> DHARMa:testOutliers with type = binomial may have inflated Type I error rates for integer-valued distributions. To get a more exact result, it is recommended to re-run testOutliers with type = 'bootstrap'. See ?testOutliers for details Created on 2023-10-27 with reprex v2.0.2 |
DHARMa uses a uniform distribution, not a normal distribution, for its left plot. CRAN is down right now and I don't have qqplotr installed so I can't share a reprex, but this code should be what we want as the plot method for library(glmmTMB)
model <- glmmTMB(
count ~ mined + spp + (1 | site),
family = poisson,
data = Salamanders
)
simulated_residuals <- simulate_residuals(model)
plot_simulated_residuals <- function(x) {
dp <- list(min = 0, max = 1, lower.tail = TRUE, log.p = FALSE)
ggplot2::ggplot(
tibble::tibble(scaled_residuals = residuals(x)),
ggplot2::aes(sample = scaled_residuals)
) +
qqplotr::stat_qq_band(distribution = "unif", dparams = list(min = 0, max = 1), alpha = .2) +
qqplotr::stat_qq_line(distribution = "unif", dparams = dp, size = .8, colour = "#3aaf85") +
qqplotr::stat_qq_point(distribution = "unif", dparams = dp, size = .5, alpha = .8, colour = "#1b6ca8") +
ggplot2::labs(
title = "Uniformity of Residuals",
subtitle = "Dots should fall along the line",
x = "Standard Uniform Distribution Quantiles",
y = "Sample Quantiles"
) +
see::theme_lucid()
}
plot_simulated_residuals(simulated_residuals) I'm a bit skeptical about having a |
I would suggest we do |
Here's the implementation for library(performance)
# For statistical models ---------------------------------------------
# select only mpg and disp (continuous)
mt1 <- mtcars[, c(1, 3, 4)]
# create some fake outliers and attach outliers to main df
mt2 <- rbind(mt1, data.frame(
mpg = c(37, 40), disp = c(300, 400),
hp = c(110, 120)
))
# fit model with outliers
model <- lm(disp ~ mpg + hp, data = mt2)
res <- simulate_residuals(model)
check_outliers(res)
#> # Outliers detection
#>
#> Proportion of observed outliers: 2.94%
#> Proportion of expected outliers: 0.80%, 95% CI [0.07, 15.33]
#> No outliers were detected (p = 0.238).
DHARMa::testOutliers(res, plot = FALSE)
#>
#> DHARMa outlier test based on exact binomial test with approximate
#> expectations
#>
#> data: res
#> outliers at both margin(s) = 1, observations = 34, p-value = 0.2381
#> alternative hypothesis: true probability of success is not equal to 0.007968127
#> 95 percent confidence interval:
#> 0.0007443642 0.1532676696
#> sample estimates:
#> frequency of outliers (expected: 0.00796812749003984 )
#> 0.02941176
DHARMa::testOutliers(res, type = "bootstrap", plot = FALSE)
#>
#> DHARMa bootstrapped outlier test
#>
#> data: res
#> outliers at both margin(s) = 1, observations = 34, p-value = 0.78
#> alternative hypothesis: two.sided
#> percent confidence interval:
#> 0.00000000 0.05882353
#> sample estimates:
#> outlier frequency (expected: 0.015 )
#> 0.02941176
check_outliers(res, type = "bootstrap", iterations = 200)
#> # Outliers detection
#>
#> Proportion of observed outliers: 2.94%
#> Proportion of expected outliers: 1.32%, 95% CI [0.00, 5.88]
#> No outliers were detected (p = 0.690). Created on 2024-03-18 with reprex v2.1.0 |
The more I dive into the DHARMa topic, the more I realize that this approach is quite similar to what we aimed at with |
Fixes #595
I'm not sure, but I think this PR could also be related to following issues:
check_model()
on aglmmTMB
example #654check_model
check notmality in GLMs? #501check_model
for NB models #500check_zeroinflation
#632check_normality()
does not support models of classglmmTMB
? #613check_heteroskedasticity
plots #408To do
check_overdispersion()
based onsimulated_residuals()
plot()
forcheck_overdispersion()