posterior predictive check for binomial glm with matrix response #645

Merged
merged 11 commits into from
Oct 29, 2023
3 changes: 2 additions & 1 deletion DESCRIPTION
@@ -1,7 +1,7 @@
Type: Package
Package: performance
Title: Assessment of Regression Models Performance
-Version: 0.10.7.1
+Version: 0.10.7.2
Authors@R:
c(person(given = "Daniel",
family = "Lüdecke",
@@ -153,3 +153,4 @@ Config/Needs/website:
r-lib/pkgdown,
easystats/easystatstemplate
Config/rcmdcheck/ignore-inconsequential-notes: true
+Remotes: easystats/insight
13 changes: 13 additions & 0 deletions NEWS.md
@@ -1,3 +1,16 @@
+# performance 0.10.8
+
+## Changes
+
+* Changed behaviour of `check_predictions()` for models from binomial family,
+to get comparable plots for different ways of outcome specification. Now,
+if the outcome is a proportion, or defined as matrix of trials and successes,
+the produced plots are the same (because the models should be the same, too).
+
+## Bug fixes
+
+* Fixed CRAN check errors.
+
# performance 0.10.7

## Breaking changes
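
The NEWS entry above relies on the fact that a binomial GLM fitted to a two-column matrix of counts and one fitted to the matching proportion (with the number of trials as weights) describe the same model. A minimal sketch with simulated data follows; all variable names are made up for illustration and are not part of this PR, and it assumes the usual `glm()` convention that the matrix columns are successes and failures.

```r
# Simulated example data (hypothetical, for illustration only)
set.seed(123)
n_trials <- rep(20, 30)
x <- rnorm(30)
successes <- rbinom(30, size = n_trials, prob = plogis(0.5 * x))
d <- data.frame(x, successes, failures = n_trials - successes, n_trials)

# Outcome as a two-column matrix of successes and failures
m_matrix <- glm(cbind(successes, failures) ~ x, family = binomial(), data = d)

# Outcome as a proportion, with the number of trials as weights
m_prop <- glm(successes / n_trials ~ x, weights = n_trials, family = binomial(), data = d)

# Both specifications should yield the same coefficients
all.equal(coef(m_matrix), coef(m_prop))
```

If the two fits agree, it is reasonable to expect a posterior predictive check to look the same for both, which is the behaviour this change aims for.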
12 changes: 11 additions & 1 deletion R/binned_residuals.R
@@ -29,6 +29,7 @@
#' time-consuming. By default, `show_dots = NULL`. In this case `binned_residuals()`
#' tries to guess whether performance will be poor due to a very large model
#' and thus automatically shows or hides dots.
+#' @param verbose Toggle warnings and messages.
#' @param ... Currently not used.
#'
#' @return A data frame representing the data that is mapped in the accompanying
@@ -83,11 +84,20 @@ binned_residuals <- function(model,
ci_type = c("exact", "gaussian", "boot"),
residuals = c("deviance", "pearson", "response"),
iterations = 1000,
+verbose = TRUE,
...) {
# match arguments
ci_type <- match.arg(ci_type)
residuals <- match.arg(residuals)

+# for non-bernoulli models, `"exact"` doesn't work
+if (isFALSE(insight::model_info(model)$is_bernoulli)) {
+ci_type <- "gaussian"
+if (verbose) {
+insight::format_alert("Using `ci_type = \"gaussian\"` because model is not bernoulli.")
+}
+}
+
fitted_values <- stats::fitted(model)
mf <- insight::get_data(model, verbose = FALSE)

@@ -186,7 +196,7 @@ binned_residuals <- function(model,
}
out <- out / n

-quant <- stats::quantile(out, c((1 - ci) / 2, (1 + ci) / 2))
+quant <- stats::quantile(out, c((1 - ci) / 2, (1 + ci) / 2), na.rm = TRUE)
c(CI_low = quant[1L], CI_high = quant[2L])
}
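
The `na.rm = TRUE` added to the `stats::quantile()` call above matters because `quantile()` stops with an error as soon as its input contains a missing value. A small sketch of the difference, using a made-up vector `out` in place of the bootstrapped values from the function:

```r
ci <- 0.95
probs <- c((1 - ci) / 2, (1 + ci) / 2)

# Hypothetical bootstrap results with one missing value
out <- c(0.10, 0.15, NA, 0.20, 0.12)

# Without na.rm, quantile() raises an error; catch it and return NULL
res_old <- tryCatch(stats::quantile(out, probs), error = function(e) NULL)

# With na.rm = TRUE, the missing value is dropped before computing the limits
quant <- unname(stats::quantile(out, probs, na.rm = TRUE))

is.null(res_old)
quant
```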

2 changes: 1 addition & 1 deletion R/check_model.R
@@ -49,7 +49,7 @@
#'
#' @details For Bayesian models from packages **rstanarm** or **brms**,
#' models will be "converted" to their frequentist counterpart, using
#' [`bayestestR::bayesian_as_frequentist`](https://easystats.github.io/bayestestR/reference/convert_bayesian_as_frequentist.html).

Check warning on line 52 in R/check_model.R (GitHub Actions / lint-changed-files): [line_length_linter] Lines should not be more than 120 characters. This line is 130 characters.
#' A more advanced model-check for Bayesian models will be implemented at a
#' later stage.
#'
@@ -77,7 +77,7 @@
#' plots are helpful to check model assumptions, they do not necessarily indicate
#' so-called "lack of fit", e.g. missed non-linear relationships or interactions.
#' Thus, it is always recommended to also look at
#' [effect plots, including partial residuals](https://strengejacke.github.io/ggeffects/articles/introduction_partial_residuals.html).

Check warning on line 80 in R/check_model.R (GitHub Actions / lint-changed-files): [line_length_linter] Lines should not be more than 120 characters. This line is 134 characters.
#'
#' @section Homogeneity of Variance:
#' This plot checks the assumption of equal variance (homoscedasticity). The
@@ -374,7 +374,7 @@
dat$INFLUENTIAL <- .influential_obs(model, threshold = threshold)
dat$PP_CHECK <- .safe(check_predictions(model, ...))
if (isTRUE(model_info$is_binomial)) {
-dat$BINNED_RESID <- binned_residuals(model, ...)
+dat$BINNED_RESID <- binned_residuals(model, verbose = verbose, ...)
}
if (isTRUE(model_info$is_count)) {
dat$OVERDISPERSION <- .diag_overdispersion(model)
40 changes: 24 additions & 16 deletions R/check_predictions.R
@@ -104,7 +104,7 @@
minfo <- insight::model_info(object, verbose = FALSE)

# try to find sensible default for "type" argument
suggest_dots <- (minfo$is_bernoulli || minfo$is_count || minfo$is_ordinal || minfo$is_categorical || minfo$is_multinomial)

Check warning on line 107 in R/check_predictions.R (GitHub Actions / lint-changed-files): [line_length_linter] Lines should not be more than 120 characters. This line is 124 characters.
if (missing(type) && suggest_dots) {
type <- "discrete_interval"
}
@@ -197,10 +197,10 @@
out <- .check_re_formula(out, object, iterations, re_formula, verbose, ...)

# save information about model
-if (!is.null(model_info)) {
-minfo <- model_info
-} else {
+if (is.null(model_info)) {
 minfo <- insight::model_info(object)
+} else {
+minfo <- model_info
}

# glmmTMB returns column matrix for bernoulli
@@ -215,9 +215,10 @@
}

if (is.null(out)) {
-insight::format_error(
-sprintf("Could not simulate responses. Maybe there is no `simulate()` for objects of class `%s`?", class(object)[1])
-)
+insight::format_error(sprintf(
+"Could not simulate responses. Maybe there is no `simulate()` for objects of class `%s`?",
+class(object)[1]
+))
}

# get response data, and response term, to check for transformations
@@ -263,7 +264,7 @@
out <- tryCatch(
{
matrix_sim <- stats::simulate(object, nsim = iterations, re.form = re_formula, ...)
-as.data.frame(sapply(matrix_sim, function(i) i[, 1] / i[, 2], simplify = TRUE))
+as.data.frame(sapply(matrix_sim, function(i) i[, 1] / rowSums(i, na.rm = TRUE), simplify = TRUE))
},
error = function(e) {
NULL
@@ -274,9 +275,10 @@
out <- .check_re_formula(out, object, iterations, re_formula, verbose, ...)

if (is.null(out)) {
-insight::format_error(
-sprintf("Could not simulate responses. Maybe there is no `simulate()` for objects of class `%s`?", class(object)[1])
-)
+insight::format_error(sprintf(
+"Could not simulate responses. Maybe there is no `simulate()` for objects of class `%s`?",
+class(object)[1]
+))
}

# get response data, and response term
@@ -285,13 +287,13 @@
)
resp_string <- insight::find_terms(object)$response

-out$y <- response[, 1] / response[, 2]
+out$y <- response[, 1] / rowSums(response, na.rm = TRUE)

# safe information about model
-if (!is.null(model_info)) {
-minfo <- model_info
-} else {
+if (is.null(model_info)) {
 minfo <- insight::model_info(object)
+} else {
+minfo <- model_info
}

attr(out, "check_range") <- check_range
@@ -363,14 +365,20 @@
if (is.numeric(original)) {
if (min(replicated) > min(original)) {
insight::print_color(
-insight::format_message("Warning: Minimum value of original data is not included in the replicated data.", "Model may not capture the variation of the data."),
+insight::format_message(
+"Warning: Minimum value of original data is not included in the replicated data.",
+"Model may not capture the variation of the data."
+),
"red"
)
}

if (max(replicated) < max(original)) {
insight::print_color(
-insight::format_message("Warning: Maximum value of original data is not included in the replicated data.", "Model may not capture the variation of the data."),
+insight::format_message(
+"Warning: Maximum value of original data is not included in the replicated data.",
+"Model may not capture the variation of the data."
+),
"red"
)
}
@@ -423,7 +431,7 @@
if (is.null(plus_minus)) {
sims[] <- lapply(sims, exp)
} else {
sims[] <- lapply(sims, function(i) exp(i) - plus_minus)

Check warning on line 434 in R/check_predictions.R (GitHub Actions / lint-changed-files): [unnecessary_lambda_linter] Pass exp directly as a symbol to lapply() instead of wrapping it in an unnecessary anonymous function. For example, prefer lapply(DF, sum) to lapply(DF, function(x) sum(x)).
}
} else if (grepl("log1p(", resp_string, fixed = TRUE)) {
sims[] <- lapply(sims, expm1)
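
The switch from `i[, 1] / i[, 2]` to `i[, 1] / rowSums(i, na.rm = TRUE)` in the hunks above can be illustrated with a toy response matrix. This sketch assumes the common `glm()` convention that a two-column binomial response holds successes and failures; the numbers are made up and not taken from the PR.

```r
# Hypothetical two-column response: successes and failures per row
response <- cbind(successes = c(3, 10, 0), failures = c(7, 0, 5))

# Old expression: successes / failures, which gives odds rather than a
# proportion and is infinite when a row has no failures
old_y <- response[, 1] / response[, 2]

# New expression: successes / (successes + failures), a proportion in [0, 1]
new_y <- response[, 1] / rowSums(response, na.rm = TRUE)

print(old_y)
print(new_y)
```

Under that assumption, only the second expression is on the same scale as a model whose outcome is specified directly as a proportion.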
2 changes: 2 additions & 0 deletions R/looic.R
@@ -12,6 +12,7 @@
#' @return A list with four elements, the ELPD, LOOIC and their standard errors.
#'
#' @examplesIf require("rstanarm")
+#' \donttest{
#' model <- suppressWarnings(rstanarm::stan_glm(
#' mpg ~ wt + cyl,
#' data = mtcars,
@@ -20,6 +21,7 @@
#' refresh = 0
#' ))
#' looic(model)
+#' }
#' @export
looic <- function(model, verbose = TRUE) {
insight::check_if_installed("loo")