diff --git a/.github/SUPPORT.md b/.github/SUPPORT.md new file mode 100644 index 000000000..4f52338fa --- /dev/null +++ b/.github/SUPPORT.md @@ -0,0 +1,29 @@ +# Getting help with `{performance}` + +Thanks for using `{performance}`. Before filing an issue, there are a few places +to explore and pieces to put together to make the process as smooth as possible. + +Start by making a minimal **repr**oducible **ex**ample using the +[reprex](http://reprex.tidyverse.org/) package. If you haven't heard of or used +reprex before, you're in for a treat! Seriously, reprex will make all of your +R-question-asking endeavors easier (which is a pretty insane ROI for the five to +ten minutes it'll take you to learn what it's all about). For additional reprex +pointers, check out the [Get help!](https://www.tidyverse.org/help/) resource +used by the tidyverse team. + +Armed with your reprex, the next step is to figure out where to ask: + + * If it's a question: start with StackOverflow. There are more people there to answer questions. + * If it's a bug: you're in the right place, file an issue. + * If you're not sure: let's [discuss](https://github.com/easystats/performance/discussions) it and try to figure it out! If your + problem _is_ a bug or a feature request, you can easily return here and + report it. + +Before opening a new issue, be sure to [search issues and pull requests](https://github.com/easystats/performance/issues) to make sure the +bug hasn't been reported and/or already fixed in the development version. By +default, the search will be pre-populated with `is:issue is:open`. You can +[edit the qualifiers](https://help.github.com/articles/searching-issues-and-pull-requests/) +(e.g. `is:pr`, `is:closed`) as needed. For example, you'd simply +remove `is:open` to search _all_ issues in the repo, open or closed. + +Thanks for your help! 
\ No newline at end of file diff --git a/CRAN-SUBMISSION b/CRAN-SUBMISSION index de1578819..44e7c417a 100644 --- a/CRAN-SUBMISSION +++ b/CRAN-SUBMISSION @@ -1,3 +1,3 @@ -Version: 0.10.3 -Date: 2023-04-06 14:07:07 UTC -SHA: 3198a3d95e27c0bc6470733dacf0496be7f96f43 +Version: 0.10.5 +Date: 2023-09-11 21:16:32 UTC +SHA: c3348f5c1183042544ebdfc7dbaa9489186c71ea diff --git a/DESCRIPTION b/DESCRIPTION index a99839a55..fa29f37d4 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,7 +1,7 @@ Type: Package Package: performance Title: Assessment of Regression Models Performance -Version: 0.10.3.1 +Version: 0.10.5.2 Authors@R: c(person(given = "Daniel", family = "Lüdecke", @@ -70,7 +70,7 @@ Depends: R (>= 3.6) Imports: bayestestR (>= 0.13.0), - insight (>= 0.19.1), + insight (>= 0.19.4), datawizard (>= 0.7.0), methods, stats, @@ -86,6 +86,7 @@ Suggests: boot, brms, car, + carData, CompQuadForm, correlation, cplm, @@ -124,6 +125,7 @@ Suggests: patchwork, pscl, psych, + qqplotr (>= 0.0.6), randomForest, rmarkdown, rstanarm, @@ -147,4 +149,4 @@ Config/Needs/website: r-lib/pkgdown, easystats/easystatstemplate Config/rcmdcheck/ignore-inconsequential-notes: true -Remotes: easystats/insight, easystats/see +Remotes: easystats/see, easystats/parameters diff --git a/NAMESPACE b/NAMESPACE index 3479f911a..b5c3ee92b 100644 --- a/NAMESPACE +++ b/NAMESPACE @@ -45,6 +45,7 @@ S3method(check_collinearity,probitmfx) S3method(check_collinearity,zerocount) S3method(check_collinearity,zeroinfl) S3method(check_concurvity,gam) +S3method(check_convergence,"_glm") S3method(check_convergence,default) S3method(check_convergence,glmmTMB) S3method(check_convergence,merMod) diff --git a/NEWS.md b/NEWS.md index a96613ca2..857924e29 100644 --- a/NEWS.md +++ b/NEWS.md @@ -1,5 +1,28 @@ +# performance (development version) + +# performance 0.10.5 + +## Changes to functions + +* More informative message for `test_*()` functions that "nesting" only refers + to fixed effects parameters and currently ignores random effects when detecting + nested models. + +* `check_outliers()` for `"ICS"` method is now more stable and less likely to + fail. + +* `check_convergence()` now works for *parsnip* `_glm` models. + +## Bug fixes + +* `check_collinearity()` did not work for hurdle- or zero-inflated models of + package *pscl* when model had no explicitly defined formula for the + zero-inflation model. + # performance 0.10.4 +## Changes to functions + * `icc()` and `r2_nakagawa()` gain a `ci_method` argument, to either calculate confidence intervals using `boot::boot()` (instead of `lmer::bootMer()`) when `ci_method = "boot"` or analytical confidence intervals @@ -8,6 +31,22 @@ bootstrapped intervals cannot be calculated at all. Note that the default computation method is preferred. +* `check_predictions()` accepts a `bandwidth` argument (smoothing bandwidth), + which is passed down to the `plot()` methods density-estimation. + +* `check_predictions()` gains a `type` argument, which is passed down to the + `plot()` method to change plot-type (density or discrete dots/intervals). + By default, `type` is set to `"default"` for models without discrete outcomes, + and else `type = "discrete_interval"`. + +* `performance_accuracy()` now includes confidence intervals, and reports those + by default (the standard error is no longer reported, but still included). + +## Bug fixes + +* Fixed issue in `check_collinearity()` for _fixest_ models that used `i()` + to create interactions in formulas. 
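To make the `check_predictions()` and `performance_accuracy()` changes listed above concrete, here is a minimal sketch (the Poisson toy data below and the use of the **see** package for the actual plot are assumptions for illustration, not part of the changelog):

```r
library(performance)

# a count outcome, so the discrete posterior predictive check applies
set.seed(123)
d <- data.frame(y = rpois(200, 2), x = rnorm(200))
m <- glm(y ~ x, family = poisson(), data = d)

# `type` switches the plot to dots/intervals for discrete outcomes;
# `bandwidth` is passed on to the density estimation of the plot() method
check_predictions(m, type = "discrete_interval", bandwidth = "nrd")

# performance_accuracy() now reports a confidence interval by default
m2 <- lm(mpg ~ wt + cyl, data = mtcars)
performance_accuracy(m2, ci = 0.95)
```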
+ # performance 0.10.3 ## New functions @@ -79,11 +118,6 @@ * `r2()` gets `ci`, to compute (analytical) confidence intervals for the R2. -* `check_predictions()` accepts a `bw` argument (smoothing bandwidth), which is - passed down to the `plot()` methods density-estimation. The default for the - smoothing bandwidth `bw` has changed from `"nrd0"` to `"nrd"`, which seems - to produce better fitting plots for non-gaussian models. - * The model underlying `check_distribution()` was now also trained to detect cauchy, half-cauchy and inverse-gamma distributions. diff --git a/R/binned_residuals.R b/R/binned_residuals.R index 4cb50a87b..355ab63c4 100644 --- a/R/binned_residuals.R +++ b/R/binned_residuals.R @@ -49,11 +49,13 @@ #' # look at the data frame #' as.data.frame(result) #' -#' \dontrun{ +#' \donttest{ #' # plot #' if (require("see")) { -#' plot(result) -#' }} +#' plot(result, show_dots = TRUE) +#' } +#' } +#' #' @export binned_residuals <- function(model, term = NULL, n_bins = NULL, ...) { fv <- stats::fitted(model) diff --git a/R/check_collinearity.R b/R/check_collinearity.R index ded2429ed..fed6fbf60 100644 --- a/R/check_collinearity.R +++ b/R/check_collinearity.R @@ -104,7 +104,7 @@ #' examples in R and Stan. 2nd edition. Chapman and Hall/CRC. #' #' - Vanhove, J. (2019). Collinearity isn't a disease that needs curing. -#' [webpage](https://janhove.github.io/analysis/2019/09/11/collinearity) +#' [webpage](https://janhove.github.io/posts/2019-09-11-collinearity/) #' #' - Zuur AF, Ieno EN, Elphick CS. A protocol for data exploration to avoid #' common statistical problems: Data exploration. Methods in Ecology and @@ -190,7 +190,12 @@ plot.check_collinearity <- function(x, ...) { # format table for each "ViF" group - this ensures that CIs are properly formatted x <- insight::format_table(x) - colnames(x)[4] <- "Increased SE" + x <- datawizard::data_rename( + x, + pattern = "SE_factor", + replacement = "Increased SE", + verbose = FALSE + ) if (length(low_vif)) { cat("\n") @@ -435,6 +440,14 @@ check_collinearity.zerocount <- function(x, f <- insight::find_formula(x) + # hurdle or zeroinfl model can have no zero-inflation formula, in which case + # we have the same formula as for conditional formula part + if (inherits(x, c("hurdle", "zeroinfl", "zerocount")) && + component == "zero_inflated" && + is.null(f[["zero_inflated"]])) { + f$zero_inflated <- f$conditional + } + if (inherits(x, "mixor")) { terms <- labels(x$terms) } else { diff --git a/R/check_convergence.R b/R/check_convergence.R index f86bff381..1c2ab5f25 100644 --- a/R/check_convergence.R +++ b/R/check_convergence.R @@ -46,31 +46,27 @@ #' #' @family functions to check model assumptions and and assess model quality #' -#' @examples -#' if (require("lme4")) { -#' data(cbpp) -#' set.seed(1) -#' cbpp$x <- rnorm(nrow(cbpp)) -#' cbpp$x2 <- runif(nrow(cbpp)) +#' @examplesIf require("lme4") && require("glmmTMB") +#' data(cbpp, package = "lme4") +#' set.seed(1) +#' cbpp$x <- rnorm(nrow(cbpp)) +#' cbpp$x2 <- runif(nrow(cbpp)) #' -#' model <- glmer( -#' cbind(incidence, size - incidence) ~ period + x + x2 + (1 + x | herd), -#' data = cbpp, -#' family = binomial() -#' ) +#' model <- lme4::glmer( +#' cbind(incidence, size - incidence) ~ period + x + x2 + (1 + x | herd), +#' data = cbpp, +#' family = binomial() +#' ) #' -#' check_convergence(model) -#' } +#' check_convergence(model) #' -#' \dontrun{ -#' if (require("glmmTMB")) { -#' model <- glmmTMB( -#' Sepal.Length ~ poly(Petal.Width, 4) * poly(Petal.Length, 4) + -#' (1 + poly(Petal.Width, 4) 
| Species), -#' data = iris -#' ) -#' check_convergence(model) -#' } +#' \donttest{ +#' model <- suppressWarnings(glmmTMB::glmmTMB( +#' Sepal.Length ~ poly(Petal.Width, 4) * poly(Petal.Length, 4) + +#' (1 + poly(Petal.Width, 4) | Species), +#' data = iris +#' )) +#' check_convergence(model) #' } #' @export check_convergence <- function(x, tolerance = 0.001, ...) { @@ -107,3 +103,9 @@ check_convergence.glmmTMB <- function(x, ...) { # https://github.com/glmmTMB/glmmTMB/issues/275 isTRUE(x$sdr$pdHess) } + + +#' @export +check_convergence._glm <- function(x, ...) { + isTRUE(x$fit$converged) +} diff --git a/R/check_distribution.R b/R/check_distribution.R index 976fedf7c..e43b28036 100644 --- a/R/check_distribution.R +++ b/R/check_distribution.R @@ -48,15 +48,14 @@ NULL #' There is a `plot()` method, which shows the probabilities of all predicted #' distributions, however, only if the probability is greater than zero. #' -#' @examples -#' if (require("lme4") && require("parameters") && -#' require("see") && require("patchwork") && require("randomForest")) { -#' data(sleepstudy) +#' @examplesIf require("lme4") && require("parameters") && require("randomForest") +#' data(sleepstudy, package = "lme4") +#' model <<- lme4::lmer(Reaction ~ Days + (Days | Subject), sleepstudy) +#' check_distribution(model) +#' +#' @examplesIf require("see") && require("patchwork") && require("randomForest") +#' plot(check_distribution(model)) #' -#' model <<- lmer(Reaction ~ Days + (Days | Subject), sleepstudy) -#' check_distribution(model) -#' plot(check_distribution(model)) -#' } #' @export check_distribution <- function(model) { UseMethod("check_distribution") @@ -196,23 +195,23 @@ check_distribution.numeric <- function(model) { x <- x[!is.na(x)] data.frame( - "SD" = stats::sd(x), - "MAD" = stats::mad(x, constant = 1), - "Mean_Median_Distance" = mean(x) - stats::median(x), - "Mean_Mode_Distance" = mean(x) - as.numeric(bayestestR::map_estimate(x, bw = "nrd0")), - "SD_MAD_Distance" = stats::sd(x) - stats::mad(x, constant = 1), - "Var_Mean_Distance" = stats::var(x) - mean(x), - "Range_SD" = diff(range(x)) / stats::sd(x), - "Range" = diff(range(x)), - "IQR" = stats::IQR(x), - "Skewness" = .skewness(x), - "Kurtosis" = .kurtosis(x), - "Uniques" = length(unique(x)) / length(x), - "N_Uniques" = length(unique(x)), - "Min" = min(x), - "Max" = max(x), - "Proportion_Positive" = sum(x >= 0) / length(x), - "Integer" = all(.is_integer(x)) + SD = stats::sd(x), + MAD = stats::mad(x, constant = 1), + Mean_Median_Distance = mean(x) - stats::median(x), + Mean_Mode_Distance = mean(x) - as.numeric(bayestestR::map_estimate(x, bw = "nrd0")), + SD_MAD_Distance = stats::sd(x) - stats::mad(x, constant = 1), + Var_Mean_Distance = stats::var(x) - mean(x), + Range_SD = diff(range(x)) / stats::sd(x), + Range = diff(range(x)), + IQR = stats::IQR(x), + Skewness = .skewness(x), + Kurtosis = .kurtosis(x), + Uniques = length(unique(x)) / length(x), + N_Uniques = length(unique(x)), + Min = min(x), + Max = max(x), + Proportion_Positive = sum(x >= 0) / length(x), + Integer = all(.is_integer(x)) ) } diff --git a/R/check_factorstructure.R b/R/check_factorstructure.R index 4f64f37d6..911fca434 100644 --- a/R/check_factorstructure.R +++ b/R/check_factorstructure.R @@ -189,10 +189,22 @@ check_sphericity_bartlett <- function(x, n = NULL, ...) 
{ out <- list(chisq = statistic, p = pval, dof = df) if (pval < 0.001) { - text <- sprintf("Bartlett's test of sphericity suggests that there is sufficient significant correlation in the data for factor analysis (Chisq(%i) = %.2f, %s).", df, statistic, insight::format_p(pval)) + text <- + sprintf( + "Bartlett's test of sphericity suggests that there is sufficient significant correlation in the data for factor analysis (Chisq(%i) = %.2f, %s).", + df, + statistic, + insight::format_p(pval) + ) color <- "green" } else { - text <- sprintf("Bartlett's test of sphericity suggests that there is not enough significant correlation in the data for factor analysis (Chisq(%i) = %.2f, %s).", df, statistic, insight::format_p(pval)) + text <- + sprintf( + "Bartlett's test of sphericity suggests that there is not enough significant correlation in the data for factor analysis (Chisq(%i) = %.2f, %s).", + df, + statistic, + insight::format_p(pval) + ) color <- "red" } diff --git a/R/check_heterogeneity_bias.R b/R/check_heterogeneity_bias.R index 75b6fce72..b87aa9962 100644 --- a/R/check_heterogeneity_bias.R +++ b/R/check_heterogeneity_bias.R @@ -2,7 +2,7 @@ #' #' `check_heterogeneity_bias()` checks if model predictors or variables may #' cause a heterogeneity bias, i.e. if variables have a within- and/or -#' between-effect. +#' between-effect (_Bell and Jones, 2015_). #' #' @param x A data frame or a mixed model object. #' @param select Character vector (or formula) with names of variables to select @@ -15,7 +15,12 @@ #' @seealso #' For further details, read the vignette #' and also -#' see documentation for `?datawizard::demean`. +#' see documentation for [`datawizard::demean()`]. +#' +#' @references +#' - Bell A, Jones K. 2015. Explaining Fixed Effects: Random Effects +#' Modeling of Time-Series Cross-Sectional and Panel Data. Political Science +#' Research and Methods, 3(1), 133–153. #' #' @examples #' data(iris) diff --git a/R/check_heteroscedasticity.R b/R/check_heteroscedasticity.R index 689057004..c3fb8a19b 100644 --- a/R/check_heteroscedasticity.R +++ b/R/check_heteroscedasticity.R @@ -11,12 +11,14 @@ #' @return The p-value of the test statistics. A p-value < 0.05 indicates a #' non-constant variance (heteroskedasticity). #' -#' @note There is also a [`plot()`-method](https://easystats.github.io/see/articles/performance.html) implemented in the \href{https://easystats.github.io/see/}{\pkg{see}-package}. +#' @note There is also a [`plot()`-method](https://easystats.github.io/see/articles/performance.html) +#' implemented in the \href{https://easystats.github.io/see/}{\pkg{see}-package}. #' #' @details This test of the hypothesis of (non-)constant error is also called #' *Breusch-Pagan test* (\cite{1979}). #' -#' @references Breusch, T. S., and Pagan, A. R. (1979) A simple test for heteroscedasticity and random coefficient variation. Econometrica 47, 1287-1294. +#' @references Breusch, T. S., and Pagan, A. R. (1979) A simple test for heteroscedasticity +#' and random coefficient variation. Econometrica 47, 1287-1294. #' #' @family functions to check model assumptions and and assess model quality #' @@ -84,9 +86,21 @@ check_heteroscedasticity.default <- function(x, ...) { #' @export print.check_heteroscedasticity <- function(x, ...) 
{ if (x < 0.05) { - insight::print_color(sprintf("Warning: Heteroscedasticity (non-constant error variance) detected (%s).\n", insight::format_p(x)), "red") + insight::print_color( + sprintf( + "Warning: Heteroscedasticity (non-constant error variance) detected (%s).\n", + insight::format_p(x) + ), + "red" + ) } else { - insight::print_color(sprintf("OK: Error variance appears to be homoscedastic (%s).\n", insight::format_p(x)), "green") + insight::print_color( + sprintf( + "OK: Error variance appears to be homoscedastic (%s).\n", + insight::format_p(x) + ), + "green" + ) } invisible(x) } diff --git a/R/check_homogeneity.R b/R/check_homogeneity.R index 0948865e6..c3a839ab7 100644 --- a/R/check_homogeneity.R +++ b/R/check_homogeneity.R @@ -16,7 +16,8 @@ #' @return Invisibly returns the p-value of the test statistics. A p-value < #' 0.05 indicates a significant difference in the variance between the groups. #' -#' @note There is also a [`plot()`-method](https://easystats.github.io/see/articles/performance.html) implemented in the \href{https://easystats.github.io/see/}{\pkg{see}-package}. +#' @note There is also a [`plot()`-method](https://easystats.github.io/see/articles/performance.html) +#' implemented in the \href{https://easystats.github.io/see/}{\pkg{see}-package}. #' #' @family functions to check model assumptions and and assess model quality #' @@ -90,9 +91,9 @@ check_homogeneity.default <- function(x, method = c("bartlett", "fligner", "leve method.string <- switch(method, - "bartlett" = "Bartlett Test", - "fligner" = "Fligner-Killeen Test", - "levene" = "Levene's Test" + bartlett = "Bartlett Test", + fligner = "Fligner-Killeen Test", + levene = "Levene's Test" ) attr(p.val, "data") <- x diff --git a/R/check_itemscale.R b/R/check_itemscale.R index 182b681bc..8d8b71082 100644 --- a/R/check_itemscale.R +++ b/R/check_itemscale.R @@ -43,17 +43,16 @@ #' - Trochim WMK (2008) Types of Reliability. #' ([web](https://conjointly.com/kb/types-of-reliability/)) #' -#' @examples +#' @examplesIf require("parameters") && require("psych") #' # data generation from '?prcomp', slightly modified #' C <- chol(S <- toeplitz(0.9^(0:15))) #' set.seed(17) #' X <- matrix(rnorm(1600), 100, 16) #' Z <- X %*% C -#' if (require("parameters") && require("psych")) { -#' pca <- principal_components(as.data.frame(Z), rotation = "varimax", n = 3) -#' pca -#' check_itemscale(pca) -#' } +#' +#' pca <- principal_components(as.data.frame(Z), rotation = "varimax", n = 3) +#' pca +#' check_itemscale(pca) #' @export check_itemscale <- function(x) { if (!inherits(x, "parameters_pca")) { diff --git a/R/check_model.R b/R/check_model.R index 26c720592..bbf6c6a84 100644 --- a/R/check_model.R +++ b/R/check_model.R @@ -27,14 +27,16 @@ #' for dots, and third color for outliers or extreme values. #' @param theme String, indicating the name of the plot-theme. Must be in the #' format `"package::theme_name"` (e.g. `"ggplot2::theme_minimal"`). -#' @param detrend Should QQ/PP plots be detrended? +#' @param detrend Logical. Should Q-Q/P-P plots be detrended? Defaults to +#' `TRUE`. #' @param show_dots Logical, if `TRUE`, will show data points in the plot. Set #' to `FALSE` for models with many observations, if generating the plot is too #' time-consuming. By default, `show_dots = NULL`. In this case `check_model()` #' tries to guess whether performance will be poor due to a very large model #' and thus automatically shows or hides dots. -#' @param verbose Toggle off warnings. 
+#' @param verbose If `FALSE` (default), suppress most warning messages. #' @param ... Currently not used. +#' @inheritParams check_predictions #' #' @return The data frame that is used for plotting. #' @@ -45,10 +47,12 @@ #' `check_normality()` etc.) to get informative messages and warnings. #' #' @details For Bayesian models from packages **rstanarm** or **brms**, -#' models will be "converted" to their frequentist counterpart, using -#' [`bayestestR::bayesian_as_frequentist`](https://easystats.github.io/bayestestR/reference/convert_bayesian_as_frequentist.html). -#' A more advanced model-check for Bayesian models will be implemented at a -#' later stage. +#' models will be "converted" to their frequentist counterpart, using +#' [`bayestestR::bayesian_as_frequentist`](https://easystats.github.io/bayestestR/reference/convert_bayesian_as_frequentist.html). +#' A more advanced model-check for Bayesian models will be implemented at a +#' later stage. +#' +#' See also the related [vignette](https://easystats.github.io/performance/articles/check_model.html). #' #' @section Posterior Predictive Checks: #' Posterior predictive checks can be used to look for systematic discrepancies @@ -136,20 +140,14 @@ #' #' @family functions to check model assumptions and and assess model quality #' -#' @examples -#' \dontrun{ +#' @examplesIf require("lme4") +#' \donttest{ #' m <- lm(mpg ~ wt + cyl + gear + disp, data = mtcars) #' check_model(m) #' -#' if (require("lme4")) { -#' m <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy) -#' check_model(m, panel = FALSE) -#' } -#' -#' if (require("rstanarm")) { -#' m <- stan_glm(mpg ~ wt + gear, data = mtcars, chains = 2, iter = 200) -#' check_model(m) -#' } +#' data(sleepstudy, package = "lme4") +#' m <- lme4::lmer(Reaction ~ Days + (Days | Subject), sleepstudy) +#' check_model(m, panel = FALSE) #' } #' @export check_model <- function(x, ...) { @@ -171,9 +169,11 @@ check_model.default <- function(x, dot_alpha = 0.8, colors = c("#3aaf85", "#1b6ca8", "#cd201f"), theme = "see::theme_lucid", - detrend = FALSE, + detrend = TRUE, show_dots = NULL, - verbose = TRUE, + bandwidth = "nrd", + type = "density", + verbose = FALSE, ...) { # check model formula if (verbose) { @@ -201,6 +201,12 @@ check_model.default <- function(x, insight::format_error(paste0("`check_model()` not implemented for models of class `", class(x)[1], "` yet.")) } + # try to find sensible default for "type" argument + suggest_dots <- (minfo$is_bernoulli || minfo$is_count || minfo$is_ordinal || minfo$is_categorical || minfo$is_multinomial) + if (missing(type) && suggest_dots) { + type <- "discrete_interval" + } + # set default for show_dots, based on "model size" if (is.null(show_dots)) { n <- .safe(insight::n_obs(x)) @@ -219,6 +225,8 @@ check_model.default <- function(x, attr(ca, "theme") <- theme attr(ca, "model_info") <- minfo attr(ca, "overdisp_type") <- list(...)$plot_type + attr(ca, "bandwidth") <- bandwidth + attr(ca, "type") <- type ca } @@ -256,6 +264,8 @@ check_model.stanreg <- function(x, theme = "see::theme_lucid", detrend = FALSE, show_dots = NULL, + bandwidth = "nrd", + type = "density", verbose = TRUE, ...) { check_model(bayestestR::bayesian_as_frequentist(x), @@ -269,6 +279,8 @@ check_model.stanreg <- function(x, theme = theme, detrend = detrend, show_dots = show_dots, + bandwidth = bandwidth, + type = type, verbose = verbose, ... 
) @@ -291,6 +303,8 @@ check_model.model_fit <- function(x, theme = "see::theme_lucid", detrend = FALSE, show_dots = NULL, + bandwidth = "nrd", + type = "density", verbose = TRUE, ...) { check_model( @@ -305,6 +319,8 @@ check_model.model_fit <- function(x, theme = theme, detrend = detrend, show_dots = show_dots, + bandwidth = bandwidth, + type = type, verbose = verbose, ... ) diff --git a/R/check_multimodal.R b/R/check_multimodal.R index 9223ee59a..55e70f78f 100644 --- a/R/check_multimodal.R +++ b/R/check_multimodal.R @@ -10,33 +10,29 @@ #' @param x A numeric vector or a data frame. #' @param ... Arguments passed to or from other methods. #' -#' @examples -#' \dontrun{ -#' if (require("multimode")) { -#' # Univariate -#' x <- rnorm(1000) -#' check_multimodal(x) -#' } +#' @examplesIf require("multimode") && require("mclust") +#' \donttest{ +#' # Univariate +#' x <- rnorm(1000) +#' check_multimodal(x) #' -#' if (require("multimode") && require("mclust")) { -#' x <- c(rnorm(1000), rnorm(1000, 2)) -#' check_multimodal(x) +#' x <- c(rnorm(1000), rnorm(1000, 2)) +#' check_multimodal(x) #' -#' # Multivariate -#' m <- data.frame( -#' x = rnorm(200), -#' y = rbeta(200, 2, 1) -#' ) -#' plot(m$x, m$y) -#' check_multimodal(m) +#' # Multivariate +#' m <- data.frame( +#' x = rnorm(200), +#' y = rbeta(200, 2, 1) +#' ) +#' plot(m$x, m$y) +#' check_multimodal(m) #' -#' m <- data.frame( -#' x = c(rnorm(100), rnorm(100, 4)), -#' y = c(rbeta(100, 2, 1), rbeta(100, 1, 4)) -#' ) -#' plot(m$x, m$y) -#' check_multimodal(m) -#' } +#' m <- data.frame( +#' x = c(rnorm(100), rnorm(100, 4)), +#' y = c(rbeta(100, 2, 1), rbeta(100, 1, 4)) +#' ) +#' plot(m$x, m$y) +#' check_multimodal(m) #' } #' @references #' - Ameijeiras-Alonso, J., Crujeiras, R. M., and Rodríguez-Casal, A. (2019). diff --git a/R/check_normality.R b/R/check_normality.R index 0f26a557d..9dc00d03f 100644 --- a/R/check_normality.R +++ b/R/check_normality.R @@ -24,19 +24,18 @@ #' (e.g. Q-Q plots) are preferable. For generalized linear models, no formal #' statistical test is carried out. Rather, there's only a `plot()` method for #' GLMs. This plot shows a half-normal Q-Q plot of the absolute value of the -#' standardized deviance residuals is shown (being in line with changes in +#' standardized deviance residuals is shown (in line with changes in #' `plot.lm()` for R 4.3+). 
#' -#' @examples +#' @examplesIf require("see") #' m <<- lm(mpg ~ wt + cyl + gear + disp, data = mtcars) #' check_normality(m) #' #' # plot results -#' if (require("see")) { -#' x <- check_normality(m) -#' plot(x) -#' } -#' \dontrun{ +#' x <- check_normality(m) +#' plot(x) +#' +#' \donttest{ #' # QQ-plot #' plot(check_normality(m), type = "qq") #' diff --git a/R/check_outliers.R b/R/check_outliers.R index 902ea5bbd..c1af3bb1f 100644 --- a/R/check_outliers.R +++ b/R/check_outliers.R @@ -299,7 +299,7 @@ #' group_iris <- datawizard::data_group(iris, "Species") #' check_outliers(group_iris) #' -#' \dontrun{ +#' \donttest{ #' # You can also run all the methods #' check_outliers(data, method = "all") #' @@ -1122,20 +1122,23 @@ check_outliers.data.frame <- function(x, ID.names = ID.names )) - count.table <- datawizard::data_filter( - out$data_ics, "Outlier_ICS > 0.5" - ) + # make sure we have valid results + if (!is.null(out)) { + count.table <- datawizard::data_filter( + out$data_ics, "Outlier_ICS > 0.5" + ) - count.table <- datawizard::data_remove( - count.table, "ICS", - regex = TRUE, as_data_frame = TRUE - ) + count.table <- datawizard::data_remove( + count.table, "ICS", + regex = TRUE, as_data_frame = TRUE + ) - if (nrow(count.table) >= 1) { - count.table$n_ICS <- "(Multivariate)" - } + if (nrow(count.table) >= 1) { + count.table$n_ICS <- "(Multivariate)" + } - outlier_count$ics <- count.table + outlier_count$ics <- count.table + } } # OPTICS @@ -1787,10 +1790,16 @@ check_outliers.metabin <- check_outliers.metagen } else { insight::print_color(sprintf("`check_outliers()` does not support models of class `%s`.\n", class(x)[1]), "red") } + return(NULL) } # Get results - cutoff <- outliers@ics.dist.cutoff + cutoff <- .safe(outliers@ics.dist.cutoff) + # sanity check + if (is.null(cutoff)) { + insight::print_color("Could not detect cut-off for outliers.\n", "red") + return(NULL) + } out$Distance_ICS <- outliers@ics.distances out$Outlier_ICS <- as.numeric(out$Distance_ICS > cutoff) diff --git a/R/check_predictions.R b/R/check_predictions.R index 2e875291a..d52b8378a 100644 --- a/R/check_predictions.R +++ b/R/check_predictions.R @@ -23,6 +23,15 @@ #' be considered in the simulated data. If `NULL` (default), condition #' on all random effects. If `NA` or `~0`, condition on no random #' effects. See `simulate()` in **lme4**. +#' @param bandwidth A character string indicating the smoothing bandwidth to +#' be used. Unlike `stats::density()`, which used `"nrd0"` as default, the +#' default used here is `"nrd"` (which seems to give more plausible results +#' for non-Gaussian models). When problems with plotting occur, try to change +#' to a different value. +#' @param type Plot type for the posterior predictive checks plot. Can be `"density"`, +#' `"discrete_dots"`, `"discrete_interval"` or `"discrete_both"` (the `discrete_*` +#' options are appropriate for models with discrete - binary, integer or ordinal +#' etc. - outcomes). #' @param verbose Toggle warnings. #' @param ... Passed down to `simulate()`. #' @@ -57,17 +66,24 @@ #' - Gelman, A., Hill, J., and Vehtari, A. (2020). Regression and Other Stories. #' Cambridge University Press. 
#' -#' @examples -#' library(performance) +#' @examplesIf require("see") +#' # linear model #' model <- lm(mpg ~ disp, data = mtcars) -#' if (require("see")) { -#' check_predictions(model) -#' } +#' check_predictions(model) +#' +#' # discrete/integer outcome +#' set.seed(99) +#' d <- iris +#' d$skewed <- rpois(150, 1) +#' model <- glm( +#' skewed ~ Species + Petal.Length + Petal.Width, +#' family = poisson(), +#' data = d +#' ) +#' check_predictions(model, type = "discrete_both") +#' #' @export -check_predictions <- function(object, - iterations = 50, - check_range = FALSE, - ...) { +check_predictions <- function(object, ...) { UseMethod("check_predictions") } @@ -77,13 +93,26 @@ check_predictions.default <- function(object, iterations = 50, check_range = FALSE, re_formula = NULL, + bandwidth = "nrd", + type = "density", verbose = TRUE, ...) { # check for valid input .is_model_valid(object) - if (isTRUE(insight::model_info(object, verbose = FALSE)$is_bayesian) && - isFALSE(inherits(object, "BFBayesFactor"))) { + # retrieve model information + minfo <- insight::model_info(object, verbose = FALSE) + + # try to find sensible default for "type" argument + suggest_dots <- (minfo$is_bernoulli || minfo$is_count || minfo$is_ordinal || minfo$is_categorical || minfo$is_multinomial) + if (missing(type) && suggest_dots) { + type <- "discrete_interval" + } + + # args + type <- match.arg(type, choices = c("density", "discrete_dots", "discrete_interval", "discrete_both")) + + if (isTRUE(minfo$is_bayesian) && isFALSE(inherits(object, "BFBayesFactor"))) { insight::check_if_installed( "bayesplot", "to create posterior prediction plots for Stan models" @@ -95,7 +124,10 @@ check_predictions.default <- function(object, iterations = iterations, check_range = check_range, re_formula = re_formula, + bandwidth = bandwidth, + type = type, verbose = verbose, + model_info = minfo, ... ) } @@ -106,6 +138,7 @@ check_predictions.BFBayesFactor <- function(object, iterations = 50, check_range = FALSE, re_formula = NULL, + bandwidth = "nrd", verbose = TRUE, ...) { everything_we_need <- .get_bfbf_predictions(object, iterations = iterations) @@ -125,6 +158,7 @@ check_predictions.BFBayesFactor <- function(object, out <- as.data.frame(yrep) colnames(out) <- paste0("sim_", seq_len(ncol(out))) out$y <- y + attr(out, "bandwidth") <- bandwidth attr(out, "check_range") <- check_range class(out) <- c("performance_pp_check", "see_performance_pp_check", class(out)) out @@ -146,11 +180,14 @@ pp_check.lm <- function(object, iterations = 50, check_range = FALSE, re_formula = NULL, + bandwidth = "nrd", + type = "density", verbose = TRUE, + model_info = NULL, ...) { # if we have a matrix-response, continue here... if (grepl("^cbind\\((.*)\\)", insight::find_response(object, combine = TRUE))) { - return(pp_check.glm(object, iterations, check_range, re_formula, verbose, ...)) + return(pp_check.glm(object, iterations, check_range, re_formula, bandwidth, type, verbose, model_info, ...)) } # else, proceed as usual @@ -159,8 +196,15 @@ pp_check.lm <- function(object, # sanity check, for mixed models, where re.form = NULL (default) might fail out <- .check_re_formula(out, object, iterations, re_formula, verbose, ...) 
+ # save information about model + if (!is.null(model_info)) { + minfo <- model_info + } else { + minfo <- insight::model_info(object) + } + # glmmTMB returns column matrix for bernoulli - if (inherits(object, "glmmTMB") && insight::model_info(object)$is_binomial && !is.null(out)) { + if (inherits(object, "glmmTMB") && minfo$is_binomial && !is.null(out)) { out <- as.data.frame(lapply(out, function(i) { if (is.matrix(i)) { i[, 1] @@ -190,6 +234,9 @@ pp_check.lm <- function(object, attr(out, "check_range") <- check_range attr(out, "response_name") <- resp_string + attr(out, "bandwidth") <- bandwidth + attr(out, "model_info") <- minfo + attr(out, "type") <- type class(out) <- c("performance_pp_check", "see_performance_pp_check", class(out)) out } @@ -199,11 +246,14 @@ pp_check.glm <- function(object, iterations = 50, check_range = FALSE, re_formula = NULL, + bandwidth = "nrd", + type = "density", verbose = TRUE, + model_info = NULL, ...) { # if we have no matrix-response, continue here... if (!grepl("^cbind\\((.*)\\)", insight::find_response(object, combine = TRUE))) { - return(pp_check.lm(object, iterations, check_range, re_formula, ...)) + return(pp_check.lm(object, iterations, check_range, re_formula, bandwidth, type, verbose, model_info, ...)) } # else, process matrix response. for matrix response models, we compute @@ -237,8 +287,18 @@ pp_check.glm <- function(object, out$y <- response[, 1] / response[, 2] + # safe information about model + if (!is.null(model_info)) { + minfo <- model_info + } else { + minfo <- insight::model_info(object) + } + attr(out, "check_range") <- check_range attr(out, "response_name") <- resp_string + attr(out, "bandwidth") <- bandwidth + attr(out, "model_info") <- minfo + attr(out, "type") <- type class(out) <- c("performance_pp_check", "see_performance_pp_check", class(out)) out } diff --git a/R/check_sphericity.R b/R/check_sphericity.R index 5a9ebba95..a087a1a5f 100644 --- a/R/check_sphericity.R +++ b/R/check_sphericity.R @@ -10,15 +10,14 @@ #' @return Invisibly returns the p-values of the test statistics. A p-value < #' 0.05 indicates a violation of sphericity. #' -#' @examples -#' if (require("car")) { -#' soils.mod <- lm( -#' cbind(pH, N, Dens, P, Ca, Mg, K, Na, Conduc) ~ Block + Contour * Depth, -#' data = Soils -#' ) +#' @examplesIf require("car") && require("carData") +#' data(Soils, package = "carData") +#' soils.mod <- lm( +#' cbind(pH, N, Dens, P, Ca, Mg, K, Na, Conduc) ~ Block + Contour * Depth, +#' data = Soils +#' ) #' -#' check_sphericity(Manova(soils.mod)) -#' } +#' check_sphericity(Manova(soils.mod)) #' @export check_sphericity <- function(x, ...) { UseMethod("check_sphericity") diff --git a/R/check_symmetry.R b/R/check_symmetry.R index 9178e82c5..dbe77dac9 100644 --- a/R/check_symmetry.R +++ b/R/check_symmetry.R @@ -9,7 +9,7 @@ #' @param ... Not used. 
#' #' @examples -#' V <- wilcox.test(mtcars$mpg) +#' V <- suppressWarnings(wilcox.test(mtcars$mpg)) #' check_symmetry(V) #' #' @export diff --git a/R/check_zeroinflation.R b/R/check_zeroinflation.R index fbf399939..f0f19b369 100644 --- a/R/check_zeroinflation.R +++ b/R/check_zeroinflation.R @@ -21,12 +21,10 @@ #' #' @family functions to check model assumptions and and assess model quality #' -#' @examples -#' if (require("glmmTMB")) { -#' data(Salamanders) -#' m <- glm(count ~ spp + mined, family = poisson, data = Salamanders) -#' check_zeroinflation(m) -#' } +#' @examplesIf require("glmmTMB") +#' data(Salamanders, package = "glmmTMB") +#' m <- glm(count ~ spp + mined, family = poisson, data = Salamanders) +#' check_zeroinflation(m) #' @export check_zeroinflation <- function(x, tolerance = 0.05) { # check if we have poisson diff --git a/R/compare_performance.R b/R/compare_performance.R index ca1dcf9ff..76b0b329f 100644 --- a/R/compare_performance.R +++ b/R/compare_performance.R @@ -70,7 +70,7 @@ #' _Model selection and multimodel inference: A practical information-theoretic approach_ (2nd ed.). #' Springer-Verlag. \doi{10.1007/b97636} #' -#' @examples +#' @examplesIf require("lme4") #' data(iris) #' lm1 <- lm(Sepal.Length ~ Species, data = iris) #' lm2 <- lm(Sepal.Length ~ Species + Petal.Length, data = iris) @@ -78,12 +78,10 @@ #' compare_performance(lm1, lm2, lm3) #' compare_performance(lm1, lm2, lm3, rank = TRUE) #' -#' if (require("lme4")) { -#' m1 <- lm(mpg ~ wt + cyl, data = mtcars) -#' m2 <- glm(vs ~ wt + mpg, data = mtcars, family = "binomial") -#' m3 <- lmer(Petal.Length ~ Sepal.Length + (1 | Species), data = iris) -#' compare_performance(m1, m2, m3) -#' } +#' m1 <- lm(mpg ~ wt + cyl, data = mtcars) +#' m2 <- glm(vs ~ wt + mpg, data = mtcars, family = "binomial") +#' m3 <- lme4::lmer(Petal.Length ~ Sepal.Length + (1 | Species), data = iris) +#' compare_performance(m1, m2, m3) #' @inheritParams model_performance.lm #' @export compare_performance <- function(..., metrics = "all", rank = FALSE, estimator = "ML", verbose = TRUE) { diff --git a/R/icc.R b/R/icc.R index 8ce19288e..16821e85f 100644 --- a/R/icc.R +++ b/R/icc.R @@ -143,29 +143,25 @@ #' very large, the variance ratio in the output makes no sense, e.g. because #' it is negative. In such cases, it might help to use `robust = TRUE`. 
#' -#' @examples -#' if (require("lme4")) { -#' model <- lmer(Sepal.Length ~ Petal.Length + (1 | Species), data = iris) -#' icc(model) -#' } +#' @examplesIf require("lme4") +#' model <- lme4::lmer(Sepal.Length ~ Petal.Length + (1 | Species), data = iris) +#' icc(model) #' #' # ICC for specific group-levels -#' if (require("lme4")) { -#' data(sleepstudy) -#' set.seed(12345) -#' sleepstudy$grp <- sample(1:5, size = 180, replace = TRUE) -#' sleepstudy$subgrp <- NA -#' for (i in 1:5) { -#' filter_group <- sleepstudy$grp == i -#' sleepstudy$subgrp[filter_group] <- -#' sample(1:30, size = sum(filter_group), replace = TRUE) -#' } -#' model <- lmer( -#' Reaction ~ Days + (1 | grp / subgrp) + (1 | Subject), -#' data = sleepstudy -#' ) -#' icc(model, by_group = TRUE) +#' data(sleepstudy, package = "lme4") +#' set.seed(12345) +#' sleepstudy$grp <- sample(1:5, size = 180, replace = TRUE) +#' sleepstudy$subgrp <- NA +#' for (i in 1:5) { +#' filter_group <- sleepstudy$grp == i +#' sleepstudy$subgrp[filter_group] <- +#' sample(1:30, size = sum(filter_group), replace = TRUE) #' } +#' model <- lme4::lmer( +#' Reaction ~ Days + (1 | grp / subgrp) + (1 | Subject), +#' data = sleepstudy +#' ) +#' icc(model, by_group = TRUE) #' @export icc <- function(model, by_group = FALSE, @@ -357,8 +353,8 @@ variance_decomposition <- function(model, result <- structure( class = "icc_decomposed", list( - "ICC_decomposed" = 1 - fun(var_icc), - "ICC_CI" = ci_icc + ICC_decomposed = 1 - fun(var_icc), + ICC_CI = ci_icc ) ) diff --git a/R/item_intercor.R b/R/item_intercor.R index 41b29b920..56aae1d77 100644 --- a/R/item_intercor.R +++ b/R/item_intercor.R @@ -7,25 +7,24 @@ #' @param x A matrix as returned by the `cor()`-function, #' or a data frame with items (e.g. from a test or questionnaire). #' @param method Correlation computation method. May be one of -#' `"spearman"` (default), `"pearson"` or `"kendall"`. +#' `"pearson"` (default), `"spearman"` or `"kendall"`. #' You may use initial letter only. #' #' @return The mean inter-item-correlation value for `x`. #' -#' @details This function calculates a mean inter-item-correlation, i.e. -#' a correlation matrix of `x` will be computed (unless -#' `x` is already a matrix as returned by the `cor()`-function) -#' and the mean of the sum of all item's correlation values is returned. -#' Requires either a data frame or a computed `cor()`-object. -#' \cr \cr -#' \dQuote{Ideally, the average inter-item correlation for a set of -#' items should be between .20 and .40, suggesting that while the -#' items are reasonably homogeneous, they do contain sufficiently -#' unique variance so as to not be isomorphic with each other. -#' When values are lower than .20, then the items may not be -#' representative of the same content domain. If values are higher than -#' .40, the items may be only capturing a small bandwidth of the construct.} -#' \cite{(Piedmont 2014)} +#' @details This function calculates a mean inter-item-correlation, i.e. a +#' correlation matrix of `x` will be computed (unless `x` is already a matrix +#' as returned by the `cor()` function) and the mean of the sum of all items' +#' correlation values is returned. Requires either a data frame or a computed +#' `cor()` object. +#' +#' "Ideally, the average inter-item correlation for a set of items should be +#' between 0.20 and 0.40, suggesting that while the items are reasonably +#' homogeneous, they do contain sufficiently unique variance so as to not be +#' isomorphic with each other. 
When values are lower than 0.20, then the items +#' may not be representative of the same content domain. If values are higher +#' than 0.40, the items may be only capturing a small bandwidth of the +#' construct." _(Piedmont 2014)_ #' #' @references #' Piedmont RL. 2014. Inter-item Correlations. In: Michalos AC (eds) diff --git a/R/looic.R b/R/looic.R index 040d2d410..8f0a0c66e 100644 --- a/R/looic.R +++ b/R/looic.R @@ -11,11 +11,15 @@ #' #' @return A list with four elements, the ELPD, LOOIC and their standard errors. #' -#' @examples -#' if (require("rstanarm")) { -#' model <- stan_glm(mpg ~ wt + cyl, data = mtcars, chains = 1, iter = 500, refresh = 0) -#' looic(model) -#' } +#' @examplesIf require("rstanarm") +#' model <- suppressWarnings(rstanarm::stan_glm( +#' mpg ~ wt + cyl, +#' data = mtcars, +#' chains = 1, +#' iter = 500, +#' refresh = 0 +#' )) +#' looic(model) #' @export looic <- function(model, verbose = TRUE) { insight::check_if_installed("loo") diff --git a/R/model_performance.bayesian.R b/R/model_performance.bayesian.R index 66b5f936c..fd796fb60 100644 --- a/R/model_performance.bayesian.R +++ b/R/model_performance.bayesian.R @@ -40,30 +40,31 @@ #' #' - **PCP**: percentage of correct predictions, see [performance_pcp()]. #' -#' @examples -#' \dontrun{ -#' if (require("rstanarm") && require("rstantools")) { -#' model <- stan_glm(mpg ~ wt + cyl, data = mtcars, chains = 1, iter = 500, refresh = 0) -#' model_performance(model) +#' @examplesIf require("rstanarm") && require("rstantools") && require("BayesFactor") +#' \donttest{ +#' model <- suppressWarnings(rstanarm::stan_glm( +#' mpg ~ wt + cyl, +#' data = mtcars, +#' chains = 1, +#' iter = 500, +#' refresh = 0 +#' )) +#' model_performance(model) #' -#' model <- stan_glmer( -#' mpg ~ wt + cyl + (1 | gear), -#' data = mtcars, -#' chains = 1, -#' iter = 500, -#' refresh = 0 -#' ) -#' model_performance(model) -#' } -#' -#' if (require("BayesFactor") && require("rstantools")) { -#' model <- generalTestBF(carb ~ am + mpg, mtcars) +#' model <- suppressWarnings(rstanarm::stan_glmer( +#' mpg ~ wt + cyl + (1 | gear), +#' data = mtcars, +#' chains = 1, +#' iter = 500, +#' refresh = 0 +#' )) +#' model_performance(model) #' -#' model_performance(model) -#' model_performance(model[3]) +#' model <- BayesFactor::generalTestBF(carb ~ am + mpg, mtcars) #' -#' model_performance(model, average = TRUE) -#' } +#' model_performance(model) +#' model_performance(model[3]) +#' model_performance(model, average = TRUE) #' } #' @seealso [r2_bayes] #' @references Gelman, A., Goodrich, B., Gabry, J., and Vehtari, A. (2018). @@ -252,7 +253,7 @@ model_performance.BFBayesFactor <- function(model, out <- list() attri <- list() - if ("R2" %in% c(metrics)) { + if ("R2" %in% metrics) { r2 <- r2_bayes(model, average = average, prior_odds = prior_odds, verbose = verbose) attri$r2_bayes <- attributes(r2) # save attributes diff --git a/R/model_performance.lavaan.R b/R/model_performance.lavaan.R index 9822c1c2b..050dbd473 100644 --- a/R/model_performance.lavaan.R +++ b/R/model_performance.lavaan.R @@ -1,15 +1,15 @@ #' Performance of lavaan SEM / CFA Models #' #' Compute indices of model performance for SEM or CFA models from the -#' \pkg{lavaan} package. +#' **lavaan** package. #' -#' @param model A \pkg{lavaan} model. +#' @param model A **lavaan** model. 
#' @param metrics Can be `"all"` or a character vector of metrics to be -#' computed (some of `c("Chi2", "Chi2_df", "p_Chi2", "Baseline", -#' "Baseline_df", "p_Baseline", "GFI", "AGFI", "NFI", "NNFI", "CFI", -#' "RMSEA", "RMSEA_CI_low", "RMSEA_CI_high", "p_RMSEA", "RMR", "SRMR", -#' "RFI", "PNFI", "IFI", "RNI", "Loglikelihood", "AIC", "BIC", -#' "BIC_adjusted")`). +#' computed (some of `"Chi2"`, `"Chi2_df"`, `"p_Chi2"`, `"Baseline"`, +#' `"Baseline_df"`, `"p_Baseline"`, `"GFI"`, `"AGFI"`, `"NFI"`, `"NNFI"`, +#' `"CFI"`, `"RMSEA"`, `"RMSEA_CI_low"`, `"RMSEA_CI_high"`, `"p_RMSEA"`, +#' `"RMR"`, `"SRMR"`, `"RFI"`, `"PNFI"`, `"IFI"`, `"RNI"`, `"Loglikelihood"`, +#' `"AIC"`, `"BIC"`, and `"BIC_adjusted"`. #' @param verbose Toggle off warnings. #' @param ... Arguments passed to or from other methods. #' @@ -70,15 +70,14 @@ #' and the **SRMR**. #' } #' -#' @examples +#' @examplesIf require("lavaan") #' # Confirmatory Factor Analysis (CFA) --------- -#' if (require("lavaan")) { -#' structure <- " visual =~ x1 + x2 + x3 -#' textual =~ x4 + x5 + x6 -#' speed =~ x7 + x8 + x9 " -#' model <- lavaan::cfa(structure, data = HolzingerSwineford1939) -#' model_performance(model) -#' } +#' data(HolzingerSwineford1939, package = "lavaan") +#' structure <- " visual =~ x1 + x2 + x3 +#' textual =~ x4 + x5 + x6 +#' speed =~ x7 + x8 + x9 " +#' model <- lavaan::cfa(structure, data = HolzingerSwineford1939) +#' model_performance(model) #' #' @references #' @@ -113,31 +112,31 @@ model_performance.lavaan <- function(model, metrics = "all", verbose = TRUE, ... row.names(measures) <- NULL out <- data.frame( - "Chi2" = measures$chisq, - "Chi2_df" = measures$df, - "p_Chi2" = measures$pvalue, - "Baseline" = measures$baseline.chisq, - "Baseline_df" = measures$baseline.df, - "p_Baseline" = measures$baseline.pvalue, - "GFI" = measures$gfi, - "AGFI" = measures$agfi, - "NFI" = measures$nfi, - "NNFI" = measures$tli, - "CFI" = measures$cfi, - "RMSEA" = measures$rmsea, - "RMSEA_CI_low" = measures$rmsea.ci.lower, - "RMSEA_CI_high" = measures$rmsea.ci.upper, - "p_RMSEA" = measures$rmsea.pvalue, - "RMR" = measures$rmr, - "SRMR" = measures$srmr, - "RFI" = measures$rfi, - "PNFI" = measures$pnfi, - "IFI" = measures$ifi, - "RNI" = measures$rni, - "Loglikelihood" = measures$logl, - "AIC" = measures$aic, - "BIC" = measures$bic, - "BIC_adjusted" = measures$bic2 + Chi2 = measures$chisq, + Chi2_df = measures$df, + p_Chi2 = measures$pvalue, + Baseline = measures$baseline.chisq, + Baseline_df = measures$baseline.df, + p_Baseline = measures$baseline.pvalue, + GFI = measures$gfi, + AGFI = measures$agfi, + NFI = measures$nfi, + NNFI = measures$tli, + CFI = measures$cfi, + RMSEA = measures$rmsea, + RMSEA_CI_low = measures$rmsea.ci.lower, + RMSEA_CI_high = measures$rmsea.ci.upper, + p_RMSEA = measures$rmsea.pvalue, + RMR = measures$rmr, + SRMR = measures$srmr, + RFI = measures$rfi, + PNFI = measures$pnfi, + IFI = measures$ifi, + RNI = measures$rni, + Loglikelihood = measures$logl, + AIC = measures$aic, + BIC = measures$bic, + BIC_adjusted = measures$bic2 ) if (all(metrics == "all")) { @@ -167,22 +166,22 @@ model_performance.blavaan <- function(model, metrics = "all", verbose = TRUE, .. 
row.names(measures) <- NULL out <- data.frame( - "BRMSEA" = fitind[1, "EAP"], - "SD_BRMSEA" = fitind[1, "SD"], - "BGammaHat" = fitind[2, "EAP"], - "SD_BGammaHat" = fitind[2, "SD"], - "Adj_BGammaHat" = fitind[3, "EAP"], - "SD_Adj_BGammaHat" = fitind[3, "SD"], - "Loglikelihood" = measures$logl, - "BIC" = measures$bic, - "DIC" = measures$dic, - "p_DIC" = measures$p_dic, - "WAIC" = measures$waic, - "SE_WAIC" = measures$se_waic, - "p_WAIC" = measures$p_waic, - "LOOIC" = measures$looic, - "SE_LOOIC" = measures$se_loo, - "p_LOOIC" = measures$p_loo + BRMSEA = fitind[1, "EAP"], + SD_BRMSEA = fitind[1, "SD"], + BGammaHat = fitind[2, "EAP"], + SD_BGammaHat = fitind[2, "SD"], + Adj_BGammaHat = fitind[3, "EAP"], + SD_Adj_BGammaHat = fitind[3, "SD"], + Loglikelihood = measures$logl, + BIC = measures$bic, + DIC = measures$dic, + p_DIC = measures$p_dic, + WAIC = measures$waic, + SE_WAIC = measures$se_waic, + p_WAIC = measures$p_waic, + LOOIC = measures$looic, + SE_LOOIC = measures$se_loo, + p_LOOIC = measures$p_loo ) if (all(metrics == "all")) { diff --git a/R/model_performance.mixed.R b/R/model_performance.mixed.R index 2c9d0ea1b..499196dab 100644 --- a/R/model_performance.mixed.R +++ b/R/model_performance.mixed.R @@ -35,11 +35,9 @@ #' on returned indices. #' } #' -#' @examples -#' if (require("lme4")) { -#' model <- lmer(Petal.Length ~ Sepal.Length + (1 | Species), data = iris) -#' model_performance(model) -#' } +#' @examplesIf require("lme4") +#' model <- lme4::lmer(Petal.Length ~ Sepal.Length + (1 | Species), data = iris) +#' model_performance(model) #' @export model_performance.merMod <- function(model, metrics = "all", diff --git a/R/model_performance.rma.R b/R/model_performance.rma.R index 2730e5e5b..6a3fb2e93 100644 --- a/R/model_performance.rma.R +++ b/R/model_performance.rma.R @@ -47,13 +47,18 @@ #' See the documentation for `?metafor::fitstats`. #' } #' -#' @examples -#' if (require("metafor")) { -#' data(dat.bcg) -#' dat <- escalc(measure = "RR", ai = tpos, bi = tneg, ci = cpos, di = cneg, data = dat.bcg) -#' model <- rma(yi, vi, data = dat, method = "REML") -#' model_performance(model) -#' } +#' @examplesIf require("metafor") && require("metadat") +#' data(dat.bcg, package = "metadat") +#' dat <- metafor::escalc( +#' measure = "RR", +#' ai = tpos, +#' bi = tneg, +#' ci = cpos, +#' di = cneg, +#' data = dat.bcg +#' ) +#' model <- metafor::rma(yi, vi, data = dat, method = "REML") +#' model_performance(model) #' @export model_performance.rma <- function(model, metrics = "all", estimator = "ML", verbose = TRUE, ...) { if (all(metrics == "all")) { diff --git a/R/performance_accuracy.R b/R/performance_accuracy.R index ffc245093..d0adb5554 100644 --- a/R/performance_accuracy.R +++ b/R/performance_accuracy.R @@ -12,6 +12,7 @@ #' compute the accuracy values. #' @param n Number of bootstrap-samples. #' @param verbose Toggle warnings. +#' @inheritParams performance_pcp #' #' @return A list with three values: The `Accuracy` of the model #' predictions, i.e. 
the proportion of accurately predicted values from the @@ -40,6 +41,7 @@ performance_accuracy <- function(model, method = c("cv", "boot"), k = 5, n = 1000, + ci = 0.95, verbose = TRUE) { method <- match.arg(method) @@ -186,6 +188,9 @@ performance_accuracy <- function(model, list( Accuracy = mean(accuracy, na.rm = TRUE), SE = stats::sd(accuracy, na.rm = TRUE), + CI = ci, + CI_low = as.vector(stats::quantile(accuracy, 1 - ((1 + ci) / 2), na.rm = TRUE)), + CI_high = as.vector(stats::quantile(accuracy, (1 + ci) / 2, na.rm = TRUE)), Method = measure ) ) @@ -199,6 +204,9 @@ as.data.frame.performance_accuracy <- function(x, row.names = NULL, ...) { data.frame( Accuracy = x$Accuracy, SE = x$SE, + CI = x$CI, + CI_low = x$CI_low, + CI_high = x$CI_high, Method = x$Method, stringsAsFactors = FALSE, row.names = row.names, @@ -213,9 +221,14 @@ print.performance_accuracy <- function(x, ...) { insight::print_color("# Accuracy of Model Predictions\n\n", "blue") # statistics - cat(sprintf("Accuracy: %.2f%%\n", 100 * x$Accuracy)) - cat(sprintf(" SE: %.2f%%-points\n", 100 * x$SE)) - cat(sprintf(" Method: %s\n", x$Method)) + cat(sprintf( + "Accuracy (%i%% CI): %.2f%% [%.2f%%, %.2f%%]\nMethod: %s\n", + round(100 * x$CI), + 100 * x$Accuracy, + 100 * x$CI_low, + 100 * x$CI_high, + x$Method + )) invisible(x) } diff --git a/R/performance_rmse.R b/R/performance_rmse.R index b5044044f..0cc5eac90 100644 --- a/R/performance_rmse.R +++ b/R/performance_rmse.R @@ -20,16 +20,15 @@ #' #' @return Numeric, the root mean squared error. #' -#' @examples -#' if (require("nlme")) { -#' m <- lme(distance ~ age, data = Orthodont) +#' @examplesIf require("nlme") +#' data(Orthodont, package = "nlme") +#' m <- nlme::lme(distance ~ age, data = Orthodont) #' -#' # RMSE -#' performance_rmse(m, normalized = FALSE) +#' # RMSE +#' performance_rmse(m, normalized = FALSE) #' -#' # normalized RMSE -#' performance_rmse(m, normalized = TRUE) -#' } +#' # normalized RMSE +#' performance_rmse(m, normalized = TRUE) #' @export performance_rmse <- function(model, normalized = FALSE, verbose = TRUE) { tryCatch( diff --git a/R/performance_score.R b/R/performance_score.R index 7c606c71f..eb5ee9b31 100644 --- a/R/performance_score.R +++ b/R/performance_score.R @@ -32,7 +32,7 @@ #' #' @seealso [`performance_logloss()`] #' -#' @examples +#' @examplesIf require("glmmTMB") #' ## Dobson (1990) Page 93: Randomized Controlled Trial : #' counts <- c(18, 17, 15, 20, 10, 20, 25, 13, 12) #' outcome <- gl(3, 1, 9) @@ -40,18 +40,16 @@ #' model <- glm(counts ~ outcome + treatment, family = poisson()) #' #' performance_score(model) -#' \dontrun{ -#' if (require("glmmTMB")) { -#' data(Salamanders) -#' model <- glmmTMB( -#' count ~ spp + mined + (1 | site), -#' zi = ~ spp + mined, -#' family = nbinom2(), -#' data = Salamanders -#' ) +#' \donttest{ +#' data(Salamanders, package = "glmmTMB") +#' model <- glmmTMB::glmmTMB( +#' count ~ spp + mined + (1 | site), +#' zi = ~ spp + mined, +#' family = nbinom2(), +#' data = Salamanders +#' ) #' -#' performance_score(model) -#' } +#' performance_score(model) #' } #' @export performance_score <- function(model, verbose = TRUE, ...) { diff --git a/R/r2.R b/R/r2.R index baf3a18a0..26a16f9e6 100644 --- a/R/r2.R +++ b/R/r2.R @@ -32,7 +32,7 @@ #' [`r2_nakagawa()`], [`r2_tjur()`], [`r2_xu()`] and #' [`r2_zeroinflated()`]. 
#' -#' @examples +#' @examplesIf require("lme4") #' # Pseudo r-quared for GLM #' model <- glm(vs ~ wt + mpg, data = mtcars, family = "binomial") #' r2(model) @@ -41,10 +41,8 @@ #' model <- lm(mpg ~ wt + hp, data = mtcars) #' r2(model, ci = 0.95) #' -#' if (require("lme4")) { -#' model <- lmer(Sepal.Length ~ Petal.Length + (1 | Species), data = iris) -#' r2(model) -#' } +#' model <- lme4::lmer(Sepal.Length ~ Petal.Length + (1 | Species), data = iris) +#' r2(model) #' @export r2 <- function(model, ...) { UseMethod("r2") diff --git a/R/r2_bayes.R b/R/r2_bayes.R index cbdec33d5..fc007489c 100644 --- a/R/r2_bayes.R +++ b/R/r2_bayes.R @@ -33,16 +33,23 @@ #' @examples #' library(performance) #' if (require("rstanarm") && require("rstantools")) { -#' model <- stan_glm(mpg ~ wt + cyl, data = mtcars, chains = 1, iter = 500, refresh = 0) +#' model <- suppressWarnings(stan_glm( +#' mpg ~ wt + cyl, +#' data = mtcars, +#' chains = 1, +#' iter = 500, +#' refresh = 0, +#' show_messages = FALSE +#' )) #' r2_bayes(model) #' -#' model <- stan_lmer( +#' model <- suppressWarnings(stan_lmer( #' Petal.Length ~ Petal.Width + (1 | Species), #' data = iris, #' chains = 1, #' iter = 500, #' refresh = 0 -#' ) +#' )) #' r2_bayes(model) #' } #' @@ -66,12 +73,22 @@ #' r2_bayes(model) #' } #' -#' \dontrun{ +#' \donttest{ #' if (require("brms")) { -#' model <- brms::brm(mpg ~ wt + cyl, data = mtcars) +#' model <- suppressWarnings(brms::brm( +#' mpg ~ wt + cyl, +#' data = mtcars, +#' silent = 2, +#' refresh = 0 +#' )) #' r2_bayes(model) #' -#' model <- brms::brm(Petal.Length ~ Petal.Width + (1 | Species), data = iris) +#' model <- suppressWarnings(brms::brm( +#' Petal.Length ~ Petal.Width + (1 | Species), +#' data = iris, +#' silent = 2, +#' refresh = 0 +#' )) #' r2_bayes(model) #' } #' } @@ -424,10 +441,10 @@ as.data.frame.r2_bayes <- function(x, ...) { residuals.BFBayesFactor <- function(object, ...) { everything_we_need <- .get_bfbf_predictions(object, verbose = FALSE) - everything_we_need[["y"]] - apply(everything_we_need[["y_pred"]], 2, mean) + everything_we_need[["y"]] - colMeans(everything_we_need[["y_pred"]]) } #' @export fitted.BFBayesFactor <- function(object, ...) { - apply(.get_bfbf_predictions(object, verbose = FALSE)[["y_pred"]], 2, mean) + colMeans(.get_bfbf_predictions(object, verbose = FALSE)[["y_pred"]]) } diff --git a/R/r2_loo.R b/R/r2_loo.R index 1460d21d9..040d9b572 100644 --- a/R/r2_loo.R +++ b/R/r2_loo.R @@ -20,20 +20,25 @@ #' leave-one-out-adjusted posterior distribution. This is conceptually similar #' to an adjusted/unbiased R2 estimate in classical regression modeling. See #' [r2_bayes()] for an "unadjusted" R2. -#' \cr \cr +#' #' Mixed models are not currently fully supported. -#' \cr \cr +#' #' `r2_loo_posterior()` is the actual workhorse for `r2_loo()` and #' returns a posterior sample of LOO-adjusted Bayesian R2 values. #' #' @return A list with the LOO-adjusted R2 value. The standard errors #' and credible intervals for the R2 values are saved as attributes. #' -#' @examples -#' if (require("rstanarm")) { -#' model <- stan_glm(mpg ~ wt + cyl, data = mtcars, chains = 1, iter = 500, refresh = 0) -#' r2_loo(model) -#' } +#' @examplesIf require("rstanarm") && require("rstantools") +#' model <- suppressWarnings(rstanarm::stan_glm( +#' mpg ~ wt + cyl, +#' data = mtcars, +#' chains = 1, +#' iter = 500, +#' refresh = 0, +#' show_messages = FALSE +#' )) +#' r2_loo(model) #' @export r2_loo <- function(model, robust = TRUE, ci = 0.95, verbose = TRUE, ...) 
{ loo_r2 <- r2_loo_posterior(model, verbose = verbose, ...) diff --git a/R/r2_nakagawa.R b/R/r2_nakagawa.R index cdf1c8e02..9b751c843 100644 --- a/R/r2_nakagawa.R +++ b/R/r2_nakagawa.R @@ -46,12 +46,10 @@ #' generalized linear mixed-effects models revisited and expanded. Journal of #' The Royal Society Interface, 14(134), 20170213. #' -#' @examples -#' if (require("lme4")) { -#' model <- lmer(Sepal.Length ~ Petal.Length + (1 | Species), data = iris) -#' r2_nakagawa(model) -#' r2_nakagawa(model, by_group = TRUE) -#' } +#' @examplesIf require("lme4") +#' model <- lme4::lmer(Sepal.Length ~ Petal.Length + (1 | Species), data = iris) +#' r2_nakagawa(model) +#' r2_nakagawa(model, by_group = TRUE) #' @export r2_nakagawa <- function(model, by_group = FALSE, diff --git a/R/r2_somers.R b/R/r2_somers.R index 3f7fc4d5f..b39963422 100644 --- a/R/r2_somers.R +++ b/R/r2_somers.R @@ -8,7 +8,7 @@ #' @return A named vector with the R2 value. #' #' @examples -#' \dontrun{ +#' \donttest{ #' if (require("correlation") && require("Hmisc")) { #' model <- glm(vs ~ wt + mpg, data = mtcars, family = "binomial") #' r2_somers(model) diff --git a/R/test_bf.R b/R/test_bf.R index f273bb390..b50f0f652 100644 --- a/R/test_bf.R +++ b/R/test_bf.R @@ -25,7 +25,7 @@ test_bf.default <- function(..., reference = 1, text_length = NULL) { if (inherits(objects, c("ListNestedRegressions", "ListNonNestedRegressions", "ListLavaan"))) { test_bf(objects, reference = reference, text_length = text_length) } else { - stop("The models cannot be compared for some reason :/", call. = FALSE) + insight::format_error("The models cannot be compared for some reason :/") } } diff --git a/R/test_performance.R b/R/test_performance.R index 51627c66b..4ad0c18f2 100644 --- a/R/test_performance.R +++ b/R/test_performance.R @@ -32,11 +32,11 @@ #' ## Nested vs. Non-nested Models #' Model's "nesting" is an important concept of models comparison. Indeed, many #' tests only make sense when the models are *"nested",* i.e., when their -#' predictors are nested. This means that all the predictors of a model are -#' contained within the predictors of a larger model (sometimes referred to as -#' the encompassing model). For instance, `model1 (y ~ x1 + x2)` is -#' "nested" within `model2 (y ~ x1 + x2 + x3)`. Usually, people have a list -#' of nested models, for instance `m1 (y ~ 1)`, `m2 (y ~ x1)`, +#' predictors are nested. This means that all the *fixed effects* predictors of +#' a model are contained within the *fixed effects* predictors of a larger model +#' (sometimes referred to as the encompassing model). For instance, +#' `model1 (y ~ x1 + x2)` is "nested" within `model2 (y ~ x1 + x2 + x3)`. Usually, +#' people have a list of nested models, for instance `m1 (y ~ 1)`, `m2 (y ~ x1)`, #' `m3 (y ~ x1 + x2)`, `m4 (y ~ x1 + x2 + x3)`, and it is conventional #' that they are "ordered" from the smallest to largest, but it is up to the #' user to reverse the order from largest to smallest. The test then shows @@ -268,11 +268,11 @@ plot.test_performance <- function(x, ...) { #' @export format.test_performance <- function(x, digits = 2, ...) { # Format cols and names - out <- insight::format_table(x, digits = digits, ...) + out <- insight::format_table(x, digits = digits, exact = FALSE, ...) 
if (isTRUE(attributes(x)$is_nested)) { footer <- paste0( - "Models were detected as nested and are compared in sequential order.\n" + "Models were detected as nested (in terms of fixed parameters) and are compared in sequential order.\n" ) } else { footer <- paste0( diff --git a/README.Rmd b/README.Rmd index 771aab779..895fa02a8 100644 --- a/README.Rmd +++ b/README.Rmd @@ -333,6 +333,11 @@ test_performance(lm1, lm2, lm3, lm4) test_bf(lm1, lm2, lm3, lm4) ``` +### Plotting Functions + +Plotting functions are available through the [**see** package](https://easystats.github.io/see/articles/performance.html). + + # Code of Conduct Please note that the performance project is released with a [Contributor Code of Conduct](https://easystats.github.io/performance/CODE_OF_CONDUCT.html). By contributing to this project, you agree to abide by its terms. diff --git a/README.md b/README.md index 95189de17..3be782f33 100644 --- a/README.md +++ b/README.md @@ -147,8 +147,8 @@ model <- stan_glmer( r2(model) #> # Bayesian R2 with Compatibility Interval #> -#> Conditional R2: 0.953 (95% CI [0.941, 0.963]) -#> Marginal R2: 0.824 (95% CI [0.713, 0.896]) +#> Conditional R2: 0.953 (95% CI [0.942, 0.963]) +#> Marginal R2: 0.824 (95% CI [0.721, 0.899]) library(lme4) model <- lmer(Reaction ~ Days + (1 + Days | Subject), data = sleepstudy) @@ -442,6 +442,11 @@ test_bf(lm1, lm2, lm3, lm4) #> * Bayes Factor Type: BIC approximation ``` +### Plotting Functions + +Plotting functions are available through the [**see** +package](https://easystats.github.io/see/articles/performance.html). + # Code of Conduct Please note that the performance project is released with a [Contributor diff --git a/cran-comments.md b/cran-comments.md index 5909703f7..d044c232a 100644 --- a/cran-comments.md +++ b/cran-comments.md @@ -1 +1 @@ -This update fixes reverse-dependency issues from the *parameters* package. We checked all reverse dependencies, comparing R CMD check results across CRAN and dev versions of this package and saw no new problems. \ No newline at end of file +This release fixes CRAN check errors. We checked all reverse dependencies, comparing R CMD check results across CRAN and dev versions of this package and saw no new problems. 
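A quick, hedged illustration of two of the changes above (the reworded nested-models footer and the new "Plotting Functions" README section); the model formulas are illustrative, and the plot call assumes the see package is installed:

```r
library(performance)

# models that differ only in their fixed-effects terms are detected as nested
lm1 <- lm(Sepal.Length ~ Species, data = iris)
lm2 <- lm(Sepal.Length ~ Species + Petal.Length, data = iris)
lm3 <- lm(Sepal.Length ~ Species * Petal.Length, data = iris)

# prints the footer reworded above: "Models were detected as nested
# (in terms of fixed parameters) and are compared in sequential order."
test_performance(lm1, lm2, lm3)

# plot() methods for performance objects are provided by the 'see' package
library(see)
plot(compare_performance(lm1, lm2, lm3, rank = TRUE))
```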
\ No newline at end of file diff --git a/inst/WORDLIST b/inst/WORDLIST index 3445d1822..adf602827 100644 --- a/inst/WORDLIST +++ b/inst/WORDLIST @@ -1,3 +1,4 @@ +ACM AGFI AICc Agresti @@ -12,6 +13,7 @@ BFBayesFactor BMJ Baayen BayesFactor +Benthem Betancourt Bezdek Biometrics @@ -41,6 +43,7 @@ Csaki DBSCAN DOI Datenerhebung +Delacre Deskriptivstatistische Distinguishability Dom @@ -94,16 +97,20 @@ Iglewicz Intra Intraclass Itemanalyse +JB JM Jackman Jurs +KJ KMO Kelava Kettenring Kiado Killeen Kliegl +Kristensen Kullback +Lakens LOF LOGLOSS LOOIC @@ -119,6 +126,7 @@ Lomax MSA Maddala Magee +Magnusson Mahwah Marcoulides Mattan @@ -129,6 +137,7 @@ Merkle Methoden Michalos Moosbrugger +Mora Multicollinearity NFI NNFI @@ -141,8 +150,11 @@ Normed ORCID Olkin PNFI +Pek Petrov Postestimation +Pre +Psychol Psychometrika QE RFI @@ -196,9 +208,11 @@ Vuong Vuong's WAIC WMK +Weisberg Windmeijer Witten Xu +YL Zavoina Zavoinas Zhou @@ -233,6 +247,8 @@ easystats et explicitely favour +fixest +fpsyg gam geoms ggplot @@ -275,6 +291,7 @@ overfitted patilindrajeets poisson preprint +pscl quared quartile quartiles diff --git a/man/binned_residuals.Rd b/man/binned_residuals.Rd index 254e19492..ff6fb5784 100644 --- a/man/binned_residuals.Rd +++ b/man/binned_residuals.Rd @@ -57,11 +57,13 @@ result # look at the data frame as.data.frame(result) -\dontrun{ +\donttest{ # plot if (require("see")) { - plot(result) -}} + plot(result, show_dots = TRUE) +} +} + } \references{ Gelman, A., and Hill, J. (2007). Data analysis using regression and diff --git a/man/check_collinearity.Rd b/man/check_collinearity.Rd index 7a32f2221..9b943758d 100644 --- a/man/check_collinearity.Rd +++ b/man/check_collinearity.Rd @@ -153,7 +153,7 @@ Methods. Educational and Psychological Measurement, 79(5), 874–882. \item McElreath, R. (2020). Statistical rethinking: A Bayesian course with examples in R and Stan. 2nd edition. Chapman and Hall/CRC. \item Vanhove, J. (2019). Collinearity isn't a disease that needs curing. -\href{https://janhove.github.io/analysis/2019/09/11/collinearity}{webpage} +\href{https://janhove.github.io/posts/2019-09-11-collinearity/}{webpage} \item Zuur AF, Ieno EN, Elphick CS. A protocol for data exploration to avoid common statistical problems: Data exploration. Methods in Ecology and Evolution (2010) 1:3–14. diff --git a/man/check_convergence.Rd b/man/check_convergence.Rd index 501b66704..12c181a14 100644 --- a/man/check_convergence.Rd +++ b/man/check_convergence.Rd @@ -63,31 +63,29 @@ or not. 
} \examples{ -if (require("lme4")) { - data(cbpp) - set.seed(1) - cbpp$x <- rnorm(nrow(cbpp)) - cbpp$x2 <- runif(nrow(cbpp)) +\dontshow{if (require("lme4") && require("glmmTMB")) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} +data(cbpp, package = "lme4") +set.seed(1) +cbpp$x <- rnorm(nrow(cbpp)) +cbpp$x2 <- runif(nrow(cbpp)) - model <- glmer( - cbind(incidence, size - incidence) ~ period + x + x2 + (1 + x | herd), - data = cbpp, - family = binomial() - ) +model <- lme4::glmer( + cbind(incidence, size - incidence) ~ period + x + x2 + (1 + x | herd), + data = cbpp, + family = binomial() +) - check_convergence(model) -} +check_convergence(model) -\dontrun{ -if (require("glmmTMB")) { - model <- glmmTMB( - Sepal.Length ~ poly(Petal.Width, 4) * poly(Petal.Length, 4) + - (1 + poly(Petal.Width, 4) | Species), - data = iris - ) - check_convergence(model) -} +\donttest{ +model <- suppressWarnings(glmmTMB::glmmTMB( + Sepal.Length ~ poly(Petal.Width, 4) * poly(Petal.Length, 4) + + (1 + poly(Petal.Width, 4) | Species), + data = iris +)) +check_convergence(model) } +\dontshow{\}) # examplesIf} } \seealso{ Other functions to check model assumptions and and assess model quality: diff --git a/man/check_distribution.Rd b/man/check_distribution.Rd index 83ece551a..d8fd6c949 100644 --- a/man/check_distribution.Rd +++ b/man/check_distribution.Rd @@ -45,12 +45,12 @@ implemented in the \href{https://easystats.github.io/see/}{\pkg{see}-package}. } \examples{ -if (require("lme4") && require("parameters") && - require("see") && require("patchwork") && require("randomForest")) { - data(sleepstudy) - - model <<- lmer(Reaction ~ Days + (Days | Subject), sleepstudy) - check_distribution(model) - plot(check_distribution(model)) -} +\dontshow{if (require("lme4") && require("parameters") && require("randomForest")) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} +data(sleepstudy, package = "lme4") +model <<- lme4::lmer(Reaction ~ Days + (Days | Subject), sleepstudy) +check_distribution(model) +\dontshow{\}) # examplesIf} +\dontshow{if (require("see") && require("patchwork") && require("randomForest")) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} +plot(check_distribution(model)) +\dontshow{\}) # examplesIf} } diff --git a/man/check_heterogeneity_bias.Rd b/man/check_heterogeneity_bias.Rd index 9795aff22..20c0bba4c 100644 --- a/man/check_heterogeneity_bias.Rd +++ b/man/check_heterogeneity_bias.Rd @@ -20,15 +20,22 @@ argument will be ignored.} \description{ \code{check_heterogeneity_bias()} checks if model predictors or variables may cause a heterogeneity bias, i.e. if variables have a within- and/or -between-effect. +between-effect (\emph{Bell and Jones, 2015}). } \examples{ data(iris) iris$ID <- sample(1:4, nrow(iris), replace = TRUE) # fake-ID check_heterogeneity_bias(iris, select = c("Sepal.Length", "Petal.Length"), group = "ID") } +\references{ +\itemize{ +\item Bell A, Jones K. 2015. Explaining Fixed Effects: Random Effects +Modeling of Time-Series Cross-Sectional and Panel Data. Political Science +Research and Methods, 3(1), 133–153. +} +} \seealso{ For further details, read the vignette \url{https://easystats.github.io/parameters/articles/demean.html} and also -see documentation for \code{?datawizard::demean}. +see documentation for \code{\link[datawizard:demean]{datawizard::demean()}}. 
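To complement the `check_heterogeneity_bias()` entry above, a minimal sketch; the `datawizard::demean()` call mirrors the interface linked in the `\seealso` section and its arguments are assumed here rather than taken from the documentation above:

```r
library(performance)

# variables with both within- and between-group variance are flagged as
# potential sources of heterogeneity bias (Bell and Jones, 2015)
data(iris)
set.seed(123)
iris$ID <- sample(1:4, nrow(iris), replace = TRUE) # fake grouping variable
check_heterogeneity_bias(iris, select = c("Sepal.Length", "Petal.Length"), group = "ID")

# a common remedy is to split affected predictors into within- and
# between-group components, e.g. via datawizard::demean() (interface assumed)
d <- datawizard::demean(iris, select = c("Sepal.Length", "Petal.Length"), group = "ID")
head(d)
```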
} diff --git a/man/check_heteroscedasticity.Rd b/man/check_heteroscedasticity.Rd index f8a7ecad3..c5e1fba74 100644 --- a/man/check_heteroscedasticity.Rd +++ b/man/check_heteroscedasticity.Rd @@ -28,7 +28,8 @@ This test of the hypothesis of (non-)constant error is also called \emph{Breusch-Pagan test} (\cite{1979}). } \note{ -There is also a \href{https://easystats.github.io/see/articles/performance.html}{\code{plot()}-method} implemented in the \href{https://easystats.github.io/see/}{\pkg{see}-package}. +There is also a \href{https://easystats.github.io/see/articles/performance.html}{\code{plot()}-method} +implemented in the \href{https://easystats.github.io/see/}{\pkg{see}-package}. } \examples{ m <<- lm(mpg ~ wt + cyl + gear + disp, data = mtcars) @@ -41,7 +42,8 @@ if (require("see")) { } } \references{ -Breusch, T. S., and Pagan, A. R. (1979) A simple test for heteroscedasticity and random coefficient variation. Econometrica 47, 1287-1294. +Breusch, T. S., and Pagan, A. R. (1979) A simple test for heteroscedasticity +and random coefficient variation. Econometrica 47, 1287-1294. } \seealso{ Other functions to check model assumptions and and assess model quality: diff --git a/man/check_homogeneity.Rd b/man/check_homogeneity.Rd index b176530fc..2a9b94f02 100644 --- a/man/check_homogeneity.Rd +++ b/man/check_homogeneity.Rd @@ -31,7 +31,8 @@ Check model for homogeneity of variances between groups described by independent variables in a model. } \note{ -There is also a \href{https://easystats.github.io/see/articles/performance.html}{\code{plot()}-method} implemented in the \href{https://easystats.github.io/see/}{\pkg{see}-package}. +There is also a \href{https://easystats.github.io/see/articles/performance.html}{\code{plot()}-method} +implemented in the \href{https://easystats.github.io/see/}{\pkg{see}-package}. } \examples{ model <<- lm(len ~ supp + dose, data = ToothGrowth) diff --git a/man/check_itemscale.Rd b/man/check_itemscale.Rd index dc6128bbf..7fa487ab5 100644 --- a/man/check_itemscale.Rd +++ b/man/check_itemscale.Rd @@ -44,16 +44,17 @@ acceptability. Satisfactory range lies between 0.2 and 0.4. See also } } \examples{ +\dontshow{if (require("parameters") && require("psych")) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} # data generation from '?prcomp', slightly modified C <- chol(S <- toeplitz(0.9^(0:15))) set.seed(17) X <- matrix(rnorm(1600), 100, 16) Z <- X \%*\% C -if (require("parameters") && require("psych")) { - pca <- principal_components(as.data.frame(Z), rotation = "varimax", n = 3) - pca - check_itemscale(pca) -} + +pca <- principal_components(as.data.frame(Z), rotation = "varimax", n = 3) +pca +check_itemscale(pca) +\dontshow{\}) # examplesIf} } \references{ \itemize{ diff --git a/man/check_model.Rd b/man/check_model.Rd index e8379fc9e..2bf82af92 100644 --- a/man/check_model.Rd +++ b/man/check_model.Rd @@ -17,9 +17,11 @@ check_model(x, ...) dot_alpha = 0.8, colors = c("#3aaf85", "#1b6ca8", "#cd201f"), theme = "see::theme_lucid", - detrend = FALSE, + detrend = TRUE, show_dots = NULL, - verbose = TRUE, + bandwidth = "nrd", + type = "density", + verbose = FALSE, ... ) } @@ -53,7 +55,8 @@ for dots, and third color for outliers or extreme values.} \item{theme}{String, indicating the name of the plot-theme. Must be in the format \code{"package::theme_name"} (e.g. \code{"ggplot2::theme_minimal"}).} -\item{detrend}{Should QQ/PP plots be detrended?} +\item{detrend}{Logical. Should Q-Q/P-P plots be detrended? 
Defaults to +\code{TRUE}.} \item{show_dots}{Logical, if \code{TRUE}, will show data points in the plot. Set to \code{FALSE} for models with many observations, if generating the plot is too @@ -61,7 +64,18 @@ time-consuming. By default, \code{show_dots = NULL}. In this case \code{check_mo tries to guess whether performance will be poor due to a very large model and thus automatically shows or hides dots.} -\item{verbose}{Toggle off warnings.} +\item{bandwidth}{A character string indicating the smoothing bandwidth to +be used. Unlike \code{stats::density()}, which used \code{"nrd0"} as default, the +default used here is \code{"nrd"} (which seems to give more plausible results +for non-Gaussian models). When problems with plotting occur, try to change +to a different value.} + +\item{type}{Plot type for the posterior predictive checks plot. Can be \code{"density"}, +\code{"discrete_dots"}, \code{"discrete_interval"} or \code{"discrete_both"} (the \verb{discrete_*} +options are appropriate for models with discrete - binary, integer or ordinal +etc. - outcomes).} + +\item{verbose}{If \code{FALSE} (default), suppress most warning messages.} } \value{ The data frame that is used for plotting. @@ -77,6 +91,8 @@ models will be "converted" to their frequentist counterpart, using \href{https://easystats.github.io/bayestestR/reference/convert_bayesian_as_frequentist.html}{\code{bayestestR::bayesian_as_frequentist}}. A more advanced model-check for Bayesian models will be implemented at a later stage. + +See also the related \href{https://easystats.github.io/performance/articles/check_model.html}{vignette}. } \note{ This function just prepares the data for plotting. To create the plots, @@ -190,20 +206,16 @@ skipped, which also increases performance. } \examples{ -\dontrun{ +\dontshow{if (require("lme4")) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} +\donttest{ m <- lm(mpg ~ wt + cyl + gear + disp, data = mtcars) check_model(m) -if (require("lme4")) { - m <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy) - check_model(m, panel = FALSE) -} - -if (require("rstanarm")) { - m <- stan_glm(mpg ~ wt + gear, data = mtcars, chains = 2, iter = 200) - check_model(m) -} +data(sleepstudy, package = "lme4") +m <- lme4::lmer(Reaction ~ Days + (Days | Subject), sleepstudy) +check_model(m, panel = FALSE) } +\dontshow{\}) # examplesIf} } \seealso{ Other functions to check model assumptions and and assess model quality: diff --git a/man/check_multimodal.Rd b/man/check_multimodal.Rd index 43153734a..1fc7003cb 100644 --- a/man/check_multimodal.Rd +++ b/man/check_multimodal.Rd @@ -19,33 +19,31 @@ it always returns a significant result (suggesting that the distribution is multimodal). A better method might be needed here. 
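Stepping back to the `check_model()` arguments documented above, a brief sketch of how the new defaults can be overridden (the model is the one from the examples; creating the plots still requires the see package):

```r
library(performance)

m <- lm(mpg ~ wt + cyl + gear + disp, data = mtcars)

# detrended Q-Q/P-P plots are now the default; restore the previous behaviour with
check_model(m, detrend = FALSE)

# hide dots for large models, and tweak the posterior-predictive-check panel
check_model(m, show_dots = FALSE, bandwidth = "nrd0", type = "density")
```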
} \examples{ -\dontrun{ -if (require("multimode")) { - # Univariate - x <- rnorm(1000) - check_multimodal(x) -} +\dontshow{if (require("multimode") && require("mclust")) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} +\donttest{ +# Univariate +x <- rnorm(1000) +check_multimodal(x) -if (require("multimode") && require("mclust")) { - x <- c(rnorm(1000), rnorm(1000, 2)) - check_multimodal(x) +x <- c(rnorm(1000), rnorm(1000, 2)) +check_multimodal(x) - # Multivariate - m <- data.frame( - x = rnorm(200), - y = rbeta(200, 2, 1) - ) - plot(m$x, m$y) - check_multimodal(m) +# Multivariate +m <- data.frame( + x = rnorm(200), + y = rbeta(200, 2, 1) +) +plot(m$x, m$y) +check_multimodal(m) - m <- data.frame( - x = c(rnorm(100), rnorm(100, 4)), - y = c(rbeta(100, 2, 1), rbeta(100, 1, 4)) - ) - plot(m$x, m$y) - check_multimodal(m) -} -} +m <- data.frame( + x = c(rnorm(100), rnorm(100, 4)), + y = c(rbeta(100, 2, 1), rbeta(100, 1, 4)) +) +plot(m$x, m$y) +check_multimodal(m) +} +\dontshow{\}) # examplesIf} } \references{ \itemize{ diff --git a/man/check_normality.Rd b/man/check_normality.Rd index 69f3b273d..282aa6016 100644 --- a/man/check_normality.Rd +++ b/man/check_normality.Rd @@ -33,7 +33,7 @@ significant results for the distribution of residuals and visual inspection (e.g. Q-Q plots) are preferable. For generalized linear models, no formal statistical test is carried out. Rather, there's only a \code{plot()} method for GLMs. This plot shows a half-normal Q-Q plot of the absolute value of the -standardized deviance residuals is shown (being in line with changes in +standardized deviance residuals is shown (in line with changes in \code{plot.lm()} for R 4.3+). } \note{ @@ -43,19 +43,20 @@ standardized residuals, are used for the test. There is also a implemented in the \href{https://easystats.github.io/see/}{\strong{see}-package}. } \examples{ +\dontshow{if (require("see")) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} m <<- lm(mpg ~ wt + cyl + gear + disp, data = mtcars) check_normality(m) # plot results -if (require("see")) { - x <- check_normality(m) - plot(x) -} -\dontrun{ +x <- check_normality(m) +plot(x) + +\donttest{ # QQ-plot plot(check_normality(m), type = "qq") # PP-plot plot(check_normality(m), type = "pp") } +\dontshow{\}) # examplesIf} } diff --git a/man/check_outliers.Rd b/man/check_outliers.Rd index 04999ebc6..e3c218b7a 100644 --- a/man/check_outliers.Rd +++ b/man/check_outliers.Rd @@ -290,7 +290,7 @@ filtered_data <- data[outliers_info$Outlier < 0.1, ] group_iris <- datawizard::data_group(iris, "Species") check_outliers(group_iris) -\dontrun{ +\donttest{ # You can also run all the methods check_outliers(data, method = "all") diff --git a/man/check_predictions.Rd b/man/check_predictions.Rd index c7f43a66f..591c813da 100644 --- a/man/check_predictions.Rd +++ b/man/check_predictions.Rd @@ -7,24 +7,28 @@ \alias{check_posterior_predictions} \title{Posterior predictive checks} \usage{ -check_predictions(object, iterations = 50, check_range = FALSE, ...) +check_predictions(object, ...) \method{check_predictions}{default}( object, iterations = 50, check_range = FALSE, re_formula = NULL, + bandwidth = "nrd", + type = "density", verbose = TRUE, ... ) -posterior_predictive_check(object, iterations = 50, check_range = FALSE, ...) +posterior_predictive_check(object, ...) -check_posterior_predictions(object, iterations = 50, check_range = FALSE, ...) +check_posterior_predictions(object, ...) 
} \arguments{ \item{object}{A statistical model.} +\item{...}{Passed down to \code{simulate()}.} + \item{iterations}{The number of draws to simulate/bootstrap.} \item{check_range}{Logical, if \code{TRUE}, includes a plot with the minimum @@ -34,13 +38,22 @@ the variation in the original data is captured by the model or not (\emph{Gelman et al. 2020, pp.163}). The minimum and maximum values of \code{y} should be inside the range of the related minimum and maximum values of \code{yrep}.} -\item{...}{Passed down to \code{simulate()}.} - \item{re_formula}{Formula containing group-level effects (random effects) to be considered in the simulated data. If \code{NULL} (default), condition on all random effects. If \code{NA} or \code{~0}, condition on no random effects. See \code{simulate()} in \strong{lme4}.} +\item{bandwidth}{A character string indicating the smoothing bandwidth to +be used. Unlike \code{stats::density()}, which used \code{"nrd0"} as default, the +default used here is \code{"nrd"} (which seems to give more plausible results +for non-Gaussian models). When problems with plotting occur, try to change +to a different value.} + +\item{type}{Plot type for the posterior predictive checks plot. Can be \code{"density"}, +\code{"discrete_dots"}, \code{"discrete_interval"} or \code{"discrete_both"} (the \verb{discrete_*} +options are appropriate for models with discrete - binary, integer or ordinal +etc. - outcomes).} + \item{verbose}{Toggle warnings.} } \value{ @@ -73,11 +86,22 @@ package that imports \strong{bayesplot} such as \strong{rstanarm} or \strong{brm is loaded, \code{pp_check()} is also available as an alias for \code{check_predictions()}. } \examples{ -library(performance) +\dontshow{if (require("see")) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} +# linear model model <- lm(mpg ~ disp, data = mtcars) -if (require("see")) { - check_predictions(model) -} +check_predictions(model) + +# discrete/integer outcome +set.seed(99) +d <- iris +d$skewed <- rpois(150, 1) +model <- glm( + skewed ~ Species + Petal.Length + Petal.Width, + family = poisson(), + data = d +) +check_predictions(model, type = "discrete_both") +\dontshow{\}) # examplesIf} } \references{ \itemize{ diff --git a/man/check_sphericity.Rd b/man/check_sphericity.Rd index 531b745a2..6aaa53b3b 100644 --- a/man/check_sphericity.Rd +++ b/man/check_sphericity.Rd @@ -20,12 +20,13 @@ Check model for violation of sphericity. For \link[=check_factorstructure]{Bartl (used for correlation matrices and factor analyses), see \link{check_sphericity_bartlett}. } \examples{ -if (require("car")) { - soils.mod <- lm( - cbind(pH, N, Dens, P, Ca, Mg, K, Na, Conduc) ~ Block + Contour * Depth, - data = Soils - ) +\dontshow{if (require("car") && require("carData")) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} +data(Soils, package = "carData") +soils.mod <- lm( + cbind(pH, N, Dens, P, Ca, Mg, K, Na, Conduc) ~ Block + Contour * Depth, + data = Soils +) - check_sphericity(Manova(soils.mod)) -} +check_sphericity(Manova(soils.mod)) +\dontshow{\}) # examplesIf} } diff --git a/man/check_symmetry.Rd b/man/check_symmetry.Rd index 99ee8953b..cb58d41d6 100644 --- a/man/check_symmetry.Rd +++ b/man/check_symmetry.Rd @@ -18,7 +18,7 @@ nonparametric skew (\eqn{\frac{(Mean - Median)}{SD}}) is different than 0. This is an underlying assumption of Wilcoxon signed-rank test. 
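For context on the symmetry check described above, the nonparametric skew can be computed by hand and compared with the formal test in the example that follows (illustration only, not the package's internal code):

```r
library(performance)

# nonparametric skew: (mean - median) / sd
x <- mtcars$mpg
(mean(x) - median(x)) / sd(x)

# the formal check, based on a Wilcoxon signed-rank test object
V <- suppressWarnings(wilcox.test(mtcars$mpg))
check_symmetry(V)
```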
} \examples{ -V <- wilcox.test(mtcars$mpg) +V <- suppressWarnings(wilcox.test(mtcars$mpg)) check_symmetry(V) } diff --git a/man/check_zeroinflation.Rd b/man/check_zeroinflation.Rd index d0a62a76c..db9eddd23 100644 --- a/man/check_zeroinflation.Rd +++ b/man/check_zeroinflation.Rd @@ -30,11 +30,11 @@ zero-inflation in the data. In such cases, it is recommended to use negative binomial or zero-inflated models. } \examples{ -if (require("glmmTMB")) { - data(Salamanders) - m <- glm(count ~ spp + mined, family = poisson, data = Salamanders) - check_zeroinflation(m) -} +\dontshow{if (require("glmmTMB")) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} +data(Salamanders, package = "glmmTMB") +m <- glm(count ~ spp + mined, family = poisson, data = Salamanders) +check_zeroinflation(m) +\dontshow{\}) # examplesIf} } \seealso{ Other functions to check model assumptions and and assess model quality: diff --git a/man/compare_performance.Rd b/man/compare_performance.Rd index cfa80eda0..30c324351 100644 --- a/man/compare_performance.Rd +++ b/man/compare_performance.Rd @@ -91,6 +91,7 @@ same (AIC/...) values as from the defaults in \code{AIC.merMod()}. There is also a \href{https://easystats.github.io/see/articles/performance.html}{\code{plot()}-method} implemented in the \href{https://easystats.github.io/see/}{\pkg{see}-package}. } \examples{ +\dontshow{if (require("lme4")) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} data(iris) lm1 <- lm(Sepal.Length ~ Species, data = iris) lm2 <- lm(Sepal.Length ~ Species + Petal.Length, data = iris) @@ -98,12 +99,11 @@ lm3 <- lm(Sepal.Length ~ Species * Petal.Length, data = iris) compare_performance(lm1, lm2, lm3) compare_performance(lm1, lm2, lm3, rank = TRUE) -if (require("lme4")) { - m1 <- lm(mpg ~ wt + cyl, data = mtcars) - m2 <- glm(vs ~ wt + mpg, data = mtcars, family = "binomial") - m3 <- lmer(Petal.Length ~ Sepal.Length + (1 | Species), data = iris) - compare_performance(m1, m2, m3) -} +m1 <- lm(mpg ~ wt + cyl, data = mtcars) +m2 <- glm(vs ~ wt + mpg, data = mtcars, family = "binomial") +m3 <- lme4::lmer(Petal.Length ~ Sepal.Length + (1 | Species), data = iris) +compare_performance(m1, m2, m3) +\dontshow{\}) # examplesIf} } \references{ Burnham, K. P., and Anderson, D. R. (2002). diff --git a/man/icc.Rd b/man/icc.Rd index facc40653..d25004e07 100644 --- a/man/icc.Rd +++ b/man/icc.Rd @@ -176,28 +176,26 @@ it is negative. In such cases, it might help to use \code{robust = TRUE}. 
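As a rough cross-check on the ICC example that follows, the adjusted ICC of a simple random-intercept model can be approximated directly from the variance components (a sketch only; the package's own computation covers more general cases):

```r
library(performance)

model <- lme4::lmer(Sepal.Length ~ Petal.Length + (1 | Species), data = iris)

# random-intercept variance relative to random-intercept plus residual variance
vc <- as.data.frame(lme4::VarCorr(model))
vc$vcov[1] / sum(vc$vcov)

# compare with the adjusted ICC reported by icc()
icc(model)
```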
} } \examples{ -if (require("lme4")) { - model <- lmer(Sepal.Length ~ Petal.Length + (1 | Species), data = iris) - icc(model) -} +\dontshow{if (require("lme4")) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} +model <- lme4::lmer(Sepal.Length ~ Petal.Length + (1 | Species), data = iris) +icc(model) # ICC for specific group-levels -if (require("lme4")) { - data(sleepstudy) - set.seed(12345) - sleepstudy$grp <- sample(1:5, size = 180, replace = TRUE) - sleepstudy$subgrp <- NA - for (i in 1:5) { - filter_group <- sleepstudy$grp == i - sleepstudy$subgrp[filter_group] <- - sample(1:30, size = sum(filter_group), replace = TRUE) - } - model <- lmer( - Reaction ~ Days + (1 | grp / subgrp) + (1 | Subject), - data = sleepstudy - ) - icc(model, by_group = TRUE) -} +data(sleepstudy, package = "lme4") +set.seed(12345) +sleepstudy$grp <- sample(1:5, size = 180, replace = TRUE) +sleepstudy$subgrp <- NA +for (i in 1:5) { + filter_group <- sleepstudy$grp == i + sleepstudy$subgrp[filter_group] <- + sample(1:30, size = sum(filter_group), replace = TRUE) +} +model <- lme4::lmer( + Reaction ~ Days + (1 | grp / subgrp) + (1 | Subject), + data = sleepstudy +) +icc(model, by_group = TRUE) +\dontshow{\}) # examplesIf} } \references{ \itemize{ diff --git a/man/item_intercor.Rd b/man/item_intercor.Rd index 923c8a3f0..b59f8b8c5 100644 --- a/man/item_intercor.Rd +++ b/man/item_intercor.Rd @@ -11,7 +11,7 @@ item_intercor(x, method = c("pearson", "spearman", "kendall")) or a data frame with items (e.g. from a test or questionnaire).} \item{method}{Correlation computation method. May be one of -\code{"spearman"} (default), \code{"pearson"} or \code{"kendall"}. +\code{"pearson"} (default), \code{"spearman"} or \code{"kendall"}. You may use initial letter only.} } \value{ @@ -22,20 +22,19 @@ Compute various measures of internal consistencies for tests or item-scales of questionnaires. } \details{ -This function calculates a mean inter-item-correlation, i.e. -a correlation matrix of \code{x} will be computed (unless -\code{x} is already a matrix as returned by the \code{cor()}-function) -and the mean of the sum of all item's correlation values is returned. -Requires either a data frame or a computed \code{cor()}-object. -\cr \cr -\dQuote{Ideally, the average inter-item correlation for a set of -items should be between .20 and .40, suggesting that while the -items are reasonably homogeneous, they do contain sufficiently -unique variance so as to not be isomorphic with each other. -When values are lower than .20, then the items may not be -representative of the same content domain. If values are higher than -.40, the items may be only capturing a small bandwidth of the construct.} -\cite{(Piedmont 2014)} +This function calculates a mean inter-item-correlation, i.e. a +correlation matrix of \code{x} will be computed (unless \code{x} is already a matrix +as returned by the \code{cor()} function) and the mean of the sum of all items' +correlation values is returned. Requires either a data frame or a computed +\code{cor()} object. + +"Ideally, the average inter-item correlation for a set of items should be +between 0.20 and 0.40, suggesting that while the items are reasonably +homogeneous, they do contain sufficiently unique variance so as to not be +isomorphic with each other. When values are lower than 0.20, then the items +may not be representative of the same content domain. If values are higher +than 0.40, the items may be only capturing a small bandwidth of the +construct." 
\emph{(Piedmont 2014)} } \examples{ data(mtcars) diff --git a/man/looic.Rd b/man/looic.Rd index c1893b06d..742ac3482 100644 --- a/man/looic.Rd +++ b/man/looic.Rd @@ -21,8 +21,14 @@ regressions. For LOOIC and ELPD, smaller and larger values are respectively indicative of a better fit. } \examples{ -if (require("rstanarm")) { - model <- stan_glm(mpg ~ wt + cyl, data = mtcars, chains = 1, iter = 500, refresh = 0) - looic(model) -} +\dontshow{if (require("rstanarm")) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} +model <- suppressWarnings(rstanarm::stan_glm( + mpg ~ wt + cyl, + data = mtcars, + chains = 1, + iter = 500, + refresh = 0 +)) +looic(model) +\dontshow{\}) # examplesIf} } diff --git a/man/model_performance.lavaan.Rd b/man/model_performance.lavaan.Rd index 9ac05771f..b65e9d0f0 100644 --- a/man/model_performance.lavaan.Rd +++ b/man/model_performance.lavaan.Rd @@ -7,10 +7,14 @@ \method{model_performance}{lavaan}(model, metrics = "all", verbose = TRUE, ...) } \arguments{ -\item{model}{A \pkg{lavaan} model.} +\item{model}{A \strong{lavaan} model.} \item{metrics}{Can be \code{"all"} or a character vector of metrics to be -computed (some of \code{c("Chi2", "Chi2_df", "p_Chi2", "Baseline", "Baseline_df", "p_Baseline", "GFI", "AGFI", "NFI", "NNFI", "CFI", "RMSEA", "RMSEA_CI_low", "RMSEA_CI_high", "p_RMSEA", "RMR", "SRMR", "RFI", "PNFI", "IFI", "RNI", "Loglikelihood", "AIC", "BIC", "BIC_adjusted")}).} +computed (some of \code{"Chi2"}, \code{"Chi2_df"}, \code{"p_Chi2"}, \code{"Baseline"}, +\code{"Baseline_df"}, \code{"p_Baseline"}, \code{"GFI"}, \code{"AGFI"}, \code{"NFI"}, \code{"NNFI"}, +\code{"CFI"}, \code{"RMSEA"}, \code{"RMSEA_CI_low"}, \code{"RMSEA_CI_high"}, \code{"p_RMSEA"}, +\code{"RMR"}, \code{"SRMR"}, \code{"RFI"}, \code{"PNFI"}, \code{"IFI"}, \code{"RNI"}, \code{"Loglikelihood"}, +\code{"AIC"}, \code{"BIC"}, and \code{"BIC_adjusted"}.} \item{verbose}{Toggle off warnings.} @@ -22,7 +26,7 @@ A data frame (with one row) and one column per "index" (see } \description{ Compute indices of model performance for SEM or CFA models from the -\pkg{lavaan} package. +\strong{lavaan} package. } \details{ \subsection{Indices of fit}{ @@ -73,15 +77,15 @@ and the \strong{SRMR}. } } \examples{ +\dontshow{if (require("lavaan")) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} # Confirmatory Factor Analysis (CFA) --------- -if (require("lavaan")) { - structure <- " visual =~ x1 + x2 + x3 - textual =~ x4 + x5 + x6 - speed =~ x7 + x8 + x9 " - model <- lavaan::cfa(structure, data = HolzingerSwineford1939) - model_performance(model) -} - +data(HolzingerSwineford1939, package = "lavaan") +structure <- " visual =~ x1 + x2 + x3 + textual =~ x4 + x5 + x6 + speed =~ x7 + x8 + x9 " +model <- lavaan::cfa(structure, data = HolzingerSwineford1939) +model_performance(model) +\dontshow{\}) # examplesIf} } \references{ \itemize{ diff --git a/man/model_performance.merMod.Rd b/man/model_performance.merMod.Rd index 2145ec379..519f1ee0a 100644 --- a/man/model_performance.merMod.Rd +++ b/man/model_performance.merMod.Rd @@ -58,8 +58,8 @@ on returned indices. 
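Building on the lavaan example above: besides `metrics = "all"`, a subset of the listed fit indices can be requested. A short sketch (the particular subset chosen here is arbitrary):

```r
library(performance)

data(HolzingerSwineford1939, package = "lavaan")
structure <- " visual  =~ x1 + x2 + x3
               textual =~ x4 + x5 + x6
               speed   =~ x7 + x8 + x9 "
model <- lavaan::cfa(structure, data = HolzingerSwineford1939)

# request only a few indices instead of the full set
model_performance(model, metrics = c("CFI", "RMSEA", "SRMR"))
```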
} } \examples{ -if (require("lme4")) { - model <- lmer(Petal.Length ~ Sepal.Length + (1 | Species), data = iris) - model_performance(model) -} +\dontshow{if (require("lme4")) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} +model <- lme4::lmer(Petal.Length ~ Sepal.Length + (1 | Species), data = iris) +model_performance(model) +\dontshow{\}) # examplesIf} } diff --git a/man/model_performance.rma.Rd b/man/model_performance.rma.Rd index 7ace733b1..69d1923ba 100644 --- a/man/model_performance.rma.Rd +++ b/man/model_performance.rma.Rd @@ -65,10 +65,17 @@ See the documentation for \code{?metafor::fitstats}. } } \examples{ -if (require("metafor")) { - data(dat.bcg) - dat <- escalc(measure = "RR", ai = tpos, bi = tneg, ci = cpos, di = cneg, data = dat.bcg) - model <- rma(yi, vi, data = dat, method = "REML") - model_performance(model) -} +\dontshow{if (require("metafor") && require("metadat")) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} +data(dat.bcg, package = "metadat") +dat <- metafor::escalc( + measure = "RR", + ai = tpos, + bi = tneg, + ci = cpos, + di = cneg, + data = dat.bcg +) +model <- metafor::rma(yi, vi, data = dat, method = "REML") +model_performance(model) +\dontshow{\}) # examplesIf} } diff --git a/man/model_performance.stanreg.Rd b/man/model_performance.stanreg.Rd index 1b1a421e7..bbd82bc53 100644 --- a/man/model_performance.stanreg.Rd +++ b/man/model_performance.stanreg.Rd @@ -60,30 +60,33 @@ values mean better fit. See \code{?loo::waic}. } } \examples{ -\dontrun{ -if (require("rstanarm") && require("rstantools")) { - model <- stan_glm(mpg ~ wt + cyl, data = mtcars, chains = 1, iter = 500, refresh = 0) - model_performance(model) +\dontshow{if (require("rstanarm") && require("rstantools") && require("BayesFactor")) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} +\donttest{ +model <- suppressWarnings(rstanarm::stan_glm( + mpg ~ wt + cyl, + data = mtcars, + chains = 1, + iter = 500, + refresh = 0 +)) +model_performance(model) - model <- stan_glmer( - mpg ~ wt + cyl + (1 | gear), - data = mtcars, - chains = 1, - iter = 500, - refresh = 0 - ) - model_performance(model) -} - -if (require("BayesFactor") && require("rstantools")) { - model <- generalTestBF(carb ~ am + mpg, mtcars) +model <- suppressWarnings(rstanarm::stan_glmer( + mpg ~ wt + cyl + (1 | gear), + data = mtcars, + chains = 1, + iter = 500, + refresh = 0 +)) +model_performance(model) - model_performance(model) - model_performance(model[3]) +model <- BayesFactor::generalTestBF(carb ~ am + mpg, mtcars) - model_performance(model, average = TRUE) -} +model_performance(model) +model_performance(model[3]) +model_performance(model, average = TRUE) } +\dontshow{\}) # examplesIf} } \references{ Gelman, A., Goodrich, B., Gabry, J., and Vehtari, A. (2018). diff --git a/man/performance_accuracy.Rd b/man/performance_accuracy.Rd index 10c33e77d..4ea166f02 100644 --- a/man/performance_accuracy.Rd +++ b/man/performance_accuracy.Rd @@ -9,6 +9,7 @@ performance_accuracy( method = c("cv", "boot"), k = 5, n = 1000, + ci = 0.95, verbose = TRUE ) } @@ -24,6 +25,8 @@ compute the accuracy values.} \item{n}{Number of bootstrap-samples.} +\item{ci}{The level of the confidence interval.} + \item{verbose}{Toggle warnings.} } \value{ diff --git a/man/performance_rmse.Rd b/man/performance_rmse.Rd index cd9b84e87..bea4534b5 100644 --- a/man/performance_rmse.Rd +++ b/man/performance_rmse.Rd @@ -35,13 +35,14 @@ range of the response variable. 
Hence, lower values indicate less residual variance. } \examples{ -if (require("nlme")) { - m <- lme(distance ~ age, data = Orthodont) +\dontshow{if (require("nlme")) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} +data(Orthodont, package = "nlme") +m <- nlme::lme(distance ~ age, data = Orthodont) - # RMSE - performance_rmse(m, normalized = FALSE) +# RMSE +performance_rmse(m, normalized = FALSE) - # normalized RMSE - performance_rmse(m, normalized = TRUE) -} +# normalized RMSE +performance_rmse(m, normalized = TRUE) +\dontshow{\}) # examplesIf} } diff --git a/man/performance_score.Rd b/man/performance_score.Rd index 2dc85faf5..21e72ca24 100644 --- a/man/performance_score.Rd +++ b/man/performance_score.Rd @@ -38,6 +38,7 @@ Code is partially based on \href{https://drizopoulos.github.io/GLMMadaptive/reference/scoring_rules.html}{GLMMadaptive::scoring_rules()}. } \examples{ +\dontshow{if (require("glmmTMB")) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} ## Dobson (1990) Page 93: Randomized Controlled Trial : counts <- c(18, 17, 15, 20, 10, 20, 25, 13, 12) outcome <- gl(3, 1, 9) @@ -45,19 +46,18 @@ treatment <- gl(3, 3) model <- glm(counts ~ outcome + treatment, family = poisson()) performance_score(model) -\dontrun{ -if (require("glmmTMB")) { - data(Salamanders) - model <- glmmTMB( - count ~ spp + mined + (1 | site), - zi = ~ spp + mined, - family = nbinom2(), - data = Salamanders - ) +\donttest{ +data(Salamanders, package = "glmmTMB") +model <- glmmTMB::glmmTMB( + count ~ spp + mined + (1 | site), + zi = ~ spp + mined, + family = nbinom2(), + data = Salamanders +) - performance_score(model) -} +performance_score(model) } +\dontshow{\}) # examplesIf} } \references{ Carvalho, A. (2016). An overview of applications of proper scoring rules. diff --git a/man/r2.Rd b/man/r2.Rd index 45169aef2..9c5c648c3 100644 --- a/man/r2.Rd +++ b/man/r2.Rd @@ -54,6 +54,7 @@ If there is no \code{r2()}-method defined for the given model class, \verb{1-sum((y-y_hat)^2)/sum((y-y_bar)^2))} } \examples{ +\dontshow{if (require("lme4")) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} # Pseudo r-quared for GLM model <- glm(vs ~ wt + mpg, data = mtcars, family = "binomial") r2(model) @@ -62,10 +63,9 @@ r2(model) model <- lm(mpg ~ wt + hp, data = mtcars) r2(model, ci = 0.95) -if (require("lme4")) { - model <- lmer(Sepal.Length ~ Petal.Length + (1 | Species), data = iris) - r2(model) -} +model <- lme4::lmer(Sepal.Length ~ Petal.Length + (1 | Species), data = iris) +r2(model) +\dontshow{\}) # examplesIf} } \seealso{ \code{\link[=r2_bayes]{r2_bayes()}}, \code{\link[=r2_coxsnell]{r2_coxsnell()}}, \code{\link[=r2_kullback]{r2_kullback()}}, diff --git a/man/r2_bayes.Rd b/man/r2_bayes.Rd index 9f1531225..63b082e08 100644 --- a/man/r2_bayes.Rd +++ b/man/r2_bayes.Rd @@ -64,16 +64,23 @@ returns a posterior sample of Bayesian R2 values. 
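To make the fallback R2 formula quoted in the `r2()` documentation above concrete, a quick manual check for an ordinary linear model (illustrative only):

```r
library(performance)

m <- lm(mpg ~ wt + hp, data = mtcars)
y <- mtcars$mpg
y_hat <- fitted(m)

# 1 - sum((y - y_hat)^2) / sum((y - y_bar)^2)
1 - sum((y - y_hat)^2) / sum((y - mean(y))^2)

# matches the (unadjusted) R2 reported for linear models
r2(m)
```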
\examples{ library(performance) if (require("rstanarm") && require("rstantools")) { - model <- stan_glm(mpg ~ wt + cyl, data = mtcars, chains = 1, iter = 500, refresh = 0) + model <- suppressWarnings(stan_glm( + mpg ~ wt + cyl, + data = mtcars, + chains = 1, + iter = 500, + refresh = 0, + show_messages = FALSE + )) r2_bayes(model) - model <- stan_lmer( + model <- suppressWarnings(stan_lmer( Petal.Length ~ Petal.Width + (1 | Species), data = iris, chains = 1, iter = 500, refresh = 0 - ) + )) r2_bayes(model) } @@ -97,12 +104,22 @@ if (require("BayesFactor")) { r2_bayes(model) } -\dontrun{ +\donttest{ if (require("brms")) { - model <- brms::brm(mpg ~ wt + cyl, data = mtcars) + model <- suppressWarnings(brms::brm( + mpg ~ wt + cyl, + data = mtcars, + silent = 2, + refresh = 0 + )) r2_bayes(model) - model <- brms::brm(Petal.Length ~ Petal.Width + (1 | Species), data = iris) + model <- suppressWarnings(brms::brm( + Petal.Length ~ Petal.Width + (1 | Species), + data = iris, + silent = 2, + refresh = 0 + )) r2_bayes(model) } } diff --git a/man/r2_loo.Rd b/man/r2_loo.Rd index 18d8b106e..e6592e08c 100644 --- a/man/r2_loo.Rd +++ b/man/r2_loo.Rd @@ -45,15 +45,22 @@ Compute LOO-adjusted R2. leave-one-out-adjusted posterior distribution. This is conceptually similar to an adjusted/unbiased R2 estimate in classical regression modeling. See \code{\link[=r2_bayes]{r2_bayes()}} for an "unadjusted" R2. -\cr \cr + Mixed models are not currently fully supported. -\cr \cr + \code{r2_loo_posterior()} is the actual workhorse for \code{r2_loo()} and returns a posterior sample of LOO-adjusted Bayesian R2 values. } \examples{ -if (require("rstanarm")) { - model <- stan_glm(mpg ~ wt + cyl, data = mtcars, chains = 1, iter = 500, refresh = 0) - r2_loo(model) -} +\dontshow{if (require("rstanarm") && require("rstantools")) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} +model <- suppressWarnings(rstanarm::stan_glm( + mpg ~ wt + cyl, + data = mtcars, + chains = 1, + iter = 500, + refresh = 0, + show_messages = FALSE +)) +r2_loo(model) +\dontshow{\}) # examplesIf} } diff --git a/man/r2_nakagawa.Rd b/man/r2_nakagawa.Rd index 3bfa31fed..357c2a3e9 100644 --- a/man/r2_nakagawa.Rd +++ b/man/r2_nakagawa.Rd @@ -74,11 +74,11 @@ The contribution of random effects can be deduced by subtracting the marginal R2 from the conditional R2 or by computing the \code{\link[=icc]{icc()}}. } \examples{ -if (require("lme4")) { - model <- lmer(Sepal.Length ~ Petal.Length + (1 | Species), data = iris) - r2_nakagawa(model) - r2_nakagawa(model, by_group = TRUE) -} +\dontshow{if (require("lme4")) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} +model <- lme4::lmer(Sepal.Length ~ Petal.Length + (1 | Species), data = iris) +r2_nakagawa(model) +r2_nakagawa(model, by_group = TRUE) +\dontshow{\}) # examplesIf} } \references{ \itemize{ diff --git a/man/r2_somers.Rd b/man/r2_somers.Rd index 679212ccc..4f6d666e0 100644 --- a/man/r2_somers.Rd +++ b/man/r2_somers.Rd @@ -16,7 +16,7 @@ A named vector with the R2 value. Calculates the Somers' Dxy rank correlation for logistic regression models. } \examples{ -\dontrun{ +\donttest{ if (require("correlation") && require("Hmisc")) { model <- glm(vs ~ wt + mpg, data = mtcars, family = "binomial") r2_somers(model) diff --git a/man/test_performance.Rd b/man/test_performance.Rd index e4701bea3..6b2953ed6 100644 --- a/man/test_performance.Rd +++ b/man/test_performance.Rd @@ -66,11 +66,11 @@ and their interpretation. 
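Relating back to the `r2_nakagawa()` entry above: the contribution of the random effects can be read off as the difference between the conditional and marginal R2. A hedged sketch, assuming the returned object exposes `R2_conditional` and `R2_marginal` as named elements:

```r
library(performance)

model <- lme4::lmer(Sepal.Length ~ Petal.Length + (1 | Species), data = iris)
out <- r2_nakagawa(model)

# contribution of the random effects (cf. icc())
out$R2_conditional - out$R2_marginal
icc(model)
```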
Model's "nesting" is an important concept of models comparison. Indeed, many tests only make sense when the models are \emph{"nested",} i.e., when their -predictors are nested. This means that all the predictors of a model are -contained within the predictors of a larger model (sometimes referred to as -the encompassing model). For instance, \code{model1 (y ~ x1 + x2)} is -"nested" within \code{model2 (y ~ x1 + x2 + x3)}. Usually, people have a list -of nested models, for instance \code{m1 (y ~ 1)}, \code{m2 (y ~ x1)}, +predictors are nested. This means that all the \emph{fixed effects} predictors of +a model are contained within the \emph{fixed effects} predictors of a larger model +(sometimes referred to as the encompassing model). For instance, +\code{model1 (y ~ x1 + x2)} is "nested" within \code{model2 (y ~ x1 + x2 + x3)}. Usually, +people have a list of nested models, for instance \code{m1 (y ~ 1)}, \code{m2 (y ~ x1)}, \code{m3 (y ~ x1 + x2)}, \code{m4 (y ~ x1 + x2 + x3)}, and it is conventional that they are "ordered" from the smallest to largest, but it is up to the user to reverse the order from largest to smallest. The test then shows diff --git a/tests/testthat/_snaps/check_collinearity.md b/tests/testthat/_snaps/check_collinearity.md index 7e99aa977..3e9aa24b7 100644 --- a/tests/testthat/_snaps/check_collinearity.md +++ b/tests/testthat/_snaps/check_collinearity.md @@ -7,8 +7,8 @@ Low Correlation - Term VIF SE_factor Increased SE - N 1.00 1.00 1.00 - P 1.00 1.00 1.00 - K 1.00 1.00 1.00 + Term VIF Increased SE Tolerance + N 1.00 1.00 1.00 + P 1.00 1.00 1.00 + K 1.00 1.00 1.00 diff --git a/tests/testthat/test-check_collinearity.R b/tests/testthat/test-check_collinearity.R index 8c9d98961..3d68b87ac 100644 --- a/tests/testthat/test-check_collinearity.R +++ b/tests/testthat/test-check_collinearity.R @@ -175,3 +175,39 @@ test_that("check_collinearity, ci = NULL", { # 518 ) expect_snapshot(out) }) + +test_that("check_collinearity, ci are NA", { + skip_if_not_installed("fixest") + data(mtcars) + i <- fixest::i + m_vif <- fixest::feols(mpg ~ disp + hp + wt + i(cyl) | carb, data = mtcars) + out <- suppressWarnings(check_collinearity(m_vif)) + expect_identical( + colnames(out), + c( + "Term", "VIF", "VIF_CI_low", "VIF_CI_high", "SE_factor", "Tolerance", + "Tolerance_CI_low", "Tolerance_CI_high" + ) + ) +}) + +test_that("check_collinearity, hurdle/zi models w/o zi-formula", { + skip_if_not_installed("pscl") + data("bioChemists", package = "pscl") + m <- pscl::hurdle( + art ~ fem + mar, + data = bioChemists, + dist = "poisson", + zero.dist = "binomial", + link = "logit" + ) + out <- check_collinearity(m) + expect_identical( + colnames(out), + c( + "Term", "VIF", "VIF_CI_low", "VIF_CI_high", "SE_factor", "Tolerance", + "Tolerance_CI_low", "Tolerance_CI_high", "Component" + ) + ) + expect_equal(out$VIF, c(1.05772, 1.05772, 1.06587, 1.06587), tolerance = 1e-4) +}) diff --git a/tests/testthat/test-check_outliers.R b/tests/testthat/test-check_outliers.R index 9ce9b3f6c..e464028d0 100644 --- a/tests/testthat/test-check_outliers.R +++ b/tests/testthat/test-check_outliers.R @@ -1,7 +1,3 @@ -skip_if_not_installed("bigutilsr") -skip_if_not_installed("ICS") -skip_if_not_installed("dbscan") - test_that("zscore negative threshold", { expect_error( check_outliers(mtcars$mpg, method = "zscore", threshold = -1), @@ -80,16 +76,15 @@ test_that("mahalanobis which", { }) test_that("mahalanobis_robust which", { + skip_if_not_installed("bigutilsr") expect_identical( which(check_outliers(mtcars, method = 
"mahalanobis_robust", threshold = 25)), as.integer(c(7, 9, 21, 24, 27, 28, 29, 31)) ) }) -## FIXME: Fails on CRAN/windows -# (should be fixed but not clear why method mcd needs a seed; -# there should not be an element of randomness to it I think) test_that("mcd which", { + # (not clear why method mcd needs a seed) set.seed(42) expect_identical( tail(which(check_outliers(mtcars[1:4], method = "mcd", threshold = 45))), @@ -98,10 +93,12 @@ test_that("mcd which", { }) ## FIXME: Fails on CRAN/windows -# (current CRAN version rstan is not compatible with R > 4.2) test_that("ics which", { - skip_if_not_installed("rstan", minimum_version = "2.26.0") - set.seed(42) + # suddenly fails on R Under development (unstable) (2023-09-07 r85102) + # gcc-13 (Debian 13.2.0-2) 13.2.0 + skip_on_cran() + skip_if_not_installed("ICS") + skip_if_not_installed("ICSOutlier") expect_identical( which(check_outliers(mtcars, method = "ics", threshold = 0.001)), as.integer(c(9, 29)) @@ -109,6 +106,7 @@ test_that("ics which", { }) test_that("optics which", { + skip_if_not_installed("dbscan") expect_identical( which(check_outliers(mtcars, method = "optics", threshold = 14)), as.integer(c(5, 7, 15, 16, 17, 24, 25, 29, 31)) @@ -116,6 +114,7 @@ test_that("optics which", { }) test_that("lof which", { + skip_if_not_installed("dbscan") expect_identical( which(check_outliers(mtcars, method = "lof", threshold = 0.005)), 31L @@ -190,6 +189,7 @@ test_that("multiple methods which", { # We exclude method ics because it is too slow test_that("all methods which", { + skip_if_not_installed("bigutilsr") expect_identical( which(check_outliers(mtcars, method = c( @@ -211,6 +211,7 @@ test_that("all methods which", { test_that("multiple methods with ID", { + skip_if_not_installed("bigutilsr") data <- datawizard::rownames_as_column(mtcars, var = "car") x <- attributes(check_outliers(data, method = c( @@ -258,6 +259,7 @@ test_that("cook which", { # }) test_that("cook multiple methods which", { + skip_if_not_installed("dbscan") model <- lm(disp ~ mpg + hp, data = mtcars) expect_identical( which(check_outliers(model, method = c("cook", "optics", "lof"))), @@ -266,6 +268,7 @@ test_that("cook multiple methods which", { }) test_that("pareto which", { + skip_if_not_installed("dbscan") skip_if_not_installed("rstanarm") set.seed(123) model <- rstanarm::stan_glm(mpg ~ qsec + wt, data = mtcars, refresh = 0) @@ -278,6 +281,8 @@ test_that("pareto which", { }) test_that("pareto multiple methods which", { + skip_if_not_installed("dbscan") + skip_if_not_installed("rstanarm") set.seed(123) model <- rstanarm::stan_glm(mpg ~ qsec + wt, data = mtcars, refresh = 0) invisible(capture.output(model)) diff --git a/tests/testthat/test-cronbachs_alpha.R b/tests/testthat/test-cronbachs_alpha.R index 1c45d8db5..a2670e979 100644 --- a/tests/testthat/test-cronbachs_alpha.R +++ b/tests/testthat/test-cronbachs_alpha.R @@ -9,10 +9,14 @@ test_that("cronbachs_alpha", { test_that("cronbachs_alpha, principal_components", { - skip_if_not_installed("parameters", minimum_version = "0.20.3") - pca <- parameters::principal_components(mtcars[, c("cyl", "gear", "carb", "hp")], n = 1) + skip_if_not_installed("parameters", minimum_version = "0.21.2.1") + pca <- parameters::principal_components(mtcars[, c("cyl", "gear", "carb", "hp")], n = 2) expect_equal(cronbachs_alpha(pca, verbose = FALSE), c(PC1 = 0.1101384), tolerance = 1e-3) expect_warning(cronbachs_alpha(pca)) + + pca <- parameters::principal_components(mtcars[, c("cyl", "gear", "carb", "hp")], n = 1) + 
expect_equal(cronbachs_alpha(pca, verbose = FALSE), c(PC1 = 0.09463206), tolerance = 1e-3) + expect_silent(cronbachs_alpha(pca)) }) test_that("cronbachs_alpha, principal_components", { diff --git a/tests/testthat/test-performance_auc.R b/tests/testthat/test-performance_auc.R new file mode 100644 index 000000000..f23a1f4bf --- /dev/null +++ b/tests/testthat/test-performance_auc.R @@ -0,0 +1,41 @@ +test_that("performance_auc", { + model_auc <- glm(vs ~ wt + mpg, data = mtcars, family = "binomial") + # message + set.seed(3) + expect_message({ + out <- performance_accuracy(model_auc) + }) + expect_equal(out$Accuracy, 0.75833, tolerance = 1e-3) + expect_equal(out$CI_low, 0.6, tolerance = 1e-3) + expect_equal(out$CI_high, 0.9875, tolerance = 1e-3) + + set.seed(12) + expect_message({ + out <- performance_accuracy(model_auc) + }) + expect_equal(out$Accuracy, 0.97222, tolerance = 1e-3) + expect_equal(out$CI_low, 0.89722, tolerance = 1e-3) + expect_equal(out$CI_high, 1, tolerance = 1e-3) + + # message + set.seed(3) + expect_message({ + out <- performance_accuracy(model_auc, ci = 0.8) + }) + expect_equal(out$Accuracy, 0.75833, tolerance = 1e-3) + expect_equal(out$CI_low, 0.6, tolerance = 1e-3) + expect_equal(out$CI_high, 0.95, tolerance = 1e-3) + + model_auc <- lm(mpg ~ wt + cyl, data = mtcars) + set.seed(123) + out <- performance_accuracy(model_auc) + expect_equal(out$Accuracy, 0.94303, tolerance = 1e-3) + expect_equal(out$CI_low, 0.8804, tolerance = 1e-3) + expect_equal(out$CI_high, 0.98231, tolerance = 1e-3) + + set.seed(123) + out <- performance_accuracy(model_auc, ci = 0.8) + expect_equal(out$Accuracy, 0.94303, tolerance = 1e-3) + expect_equal(out$CI_low, 0.90197, tolerance = 1e-3) + expect_equal(out$CI_high, 0.97567, tolerance = 1e-3) +}) diff --git a/vignettes/check_model.Rmd b/vignettes/check_model.Rmd index 24d8caa58..7e9ff0506 100644 --- a/vignettes/check_model.Rmd +++ b/vignettes/check_model.Rmd @@ -1,5 +1,5 @@ --- -title: "Checking model assumption" +title: "Checking model assumption - linear models" output: rmarkdown::html_vignette: toc: true @@ -8,7 +8,7 @@ output: tags: [r, performance, r2] vignette: > \usepackage[utf8]{inputenc} - %\VignetteIndexEntry{Checking model assumption} + %\VignetteIndexEntry{Checking model assumption - linear models} %\VignetteEngine{knitr::rmarkdown} editor_options: chunk_output_type: console @@ -117,6 +117,23 @@ As you can see, the green line in this plot deviates visibly from the blue lines The best way, if there are serious concerns that the model does not fit well to the data, is to use a different type (family) of regression models. In our example, it is obvious that we should better use a Poisson regression. +#### Plots for discrete outcomes + +For discrete or integer outcomes (like in logistic or Poisson regression), density plots are not always the best choice, as they look somewhat "wiggly" around the actual values of the dependent variables. In this case, use the `type` argument of the `plot()` method to change the plot-style. Available options are `type = "discrete_dots"` (dots for observed and replicated outcomes), `type = "discrete_interval"` (dots for observed, error bars for replicated outcomes) or `type = "discrete_both"` (both dots and error bars). 
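The plot style can also be fixed up front by passing `type` to `check_predictions()` itself (the argument is forwarded to the `plot()` method); a self-contained sketch using the same toy model as the chunk below:

```r
library(performance)

set.seed(99)
d <- iris
d$skewed <- rpois(150, 1)
m <- glm(skewed ~ Species + Petal.Length + Petal.Width, family = poisson(), data = d)

# equivalent to plotting the result with type = "discrete_both"
check_predictions(m, type = "discrete_both")
```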
+ +```{r eval=all(successfully_loaded[c("see", "ggplot2")]), warning=FALSE} +set.seed(99) +d <- iris +d$skewed <- rpois(150, 1) +m3 <- glm( + skewed ~ Species + Petal.Length + Petal.Width, + family = poisson(), + data = d +) +out <- check_predictions(m3) +plot(out, type = "discrete_both") +``` + ### Linearity This plot helps to check the assumption of linear relationship. It shows whether predictors may have a non-linear relationship with the outcome, in which case the reference line may roughly indicate that relationship. A straight and horizontal line indicates that the model specification seems to be ok. @@ -191,7 +208,7 @@ There are several ways to address heteroscedasticity. 1. Calculating heteroscedasticity-consistent standard errors accounts for the larger variation, better reflecting the increased uncertainty. This can be easily done using the **parameters** package, e.g. `parameters::model_parameters(m1, vcov = "HC3")`. A detailed vignette on robust standard errors [can be found here](https://easystats.github.io/parameters/articles/model_parameters_robust.html). -2. The heteroscedasticity can be modeled directly, e.g. using package **glmmTMB** and the dispersion formular, to estimate the dispersion parameter and account for heteroscedasticity (see _Brooks et al. 2017_). +2. The heteroscedasticity can be modeled directly, e.g. using package **glmmTMB** and the dispersion formula, to estimate the dispersion parameter and account for heteroscedasticity (see _Brooks et al. 2017_). 3. Transforming the response variable, for instance, taking the `log()`, may also help to avoid issues with heteroscedasticity. @@ -242,7 +259,7 @@ Usually, dots should fall along the green reference line. If there is some devia diagnostic_plots[[6]] ``` -In our example, we see that most data points are ok, except some observations at the tails. Whether any action is needed to fix this or not can also depend on the results of the remaining diagnostic plots. If all other plots indicate no violation of assumptions, some deviation of normality, particularly at the tails, can be less critcal. +In our example, we see that most data points are ok, except some observations at the tails. Whether any action is needed to fix this or not can also depend on the results of the remaining diagnostic plots. If all other plots indicate no violation of assumptions, some deviation of normality, particularly at the tails, can be less critical. #### How to fix this?