Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improving binned_residuals() #382

Closed
strengejacke opened this issue Jan 22, 2022 · 35 comments · Fixed by #641
Closed

Improving binned_residuals() #382

strengejacke opened this issue Jan 22, 2022 · 35 comments · Fixed by #641
Labels
Enhancement 💥 Implemented features can be improved or revised

Comments

@strengejacke
Copy link
Member

This isn't the place, but I may have some Opinions about binned_model (I would like a smoothed line added, binomial CIs on the points; I would rather have binomial CIs on the individual points and a confidence interval on the smooth rather than have a dichotomizing region that indicates 'good' and 'bad' points ...). I see that this is copying what arm::binnedplot does though ...

Originally posted by @bbolker in #376 (comment)

@strengejacke strengejacke added the Enhancement 💥 Implemented features can be improved or revised label Jan 22, 2022
@mattansb
Copy link
Member

The docs for binned_residuals are also lacking - they do not explain what is being checked, and there is a line half-missing:
image

strengejacke added a commit that referenced this issue Feb 25, 2022
@strengejacke
Copy link
Member Author

@bbolker regarding the smooth line: does it make much sense to add such a line? Even for models with a larger number of observations, due to binning, there are still just a few data points.

library(lme4)
#> Loading required package: Matrix
library(HSAUR3)
#> Loading required package: tools
library(performance)

gm2 <- glmer(outcome ~ treatment * visit + (1 | patientID),
             data = toenail,
             family = binomial)

nobs(gm2)
#> [1] 1908

binned_residuals(gm2) |> plot() + see::theme_lucid()

Created on 2022-02-25 by the reprex package (v2.0.1)

@strengejacke
Copy link
Member Author

strengejacke commented Feb 25, 2022

(it would look like something like this)

library(lme4)
#> Loading required package: Matrix
library(HSAUR3)
#> Loading required package: tools
library(performance)

gm2 <- glmer(outcome ~ treatment * visit + (1 | patientID),
             data = toenail,
             family = binomial)

nobs(gm2)
#> [1] 1908

binned_residuals(gm2) |> 
  plot() + 
  see::theme_lucid() + 
  ggplot2::stat_smooth(ggplot2::aes(y = ybar, colour = group), se = FALSE)
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'

binned_residuals(gm2) |> 
  plot() + 
  see::theme_lucid() + 
  ggplot2::stat_smooth(ggplot2::aes(y = ybar), se = FALSE)
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'

Created on 2022-02-25 by the reprex package (v2.0.1)

@bwiernik
Copy link
Contributor

https://gist.github.com/bbolker/945e684dc148239364ab0f99c9a1beed

I realized that the residuals from binomials are not binomial (so constructing binomial CIs isn't useful); however I think that a binned plot using mean_cl_boot() to generate CIs is probably fine. See examples and decide what you think is useful or not ...

@bbolker
Copy link

bbolker commented Feb 25, 2022

Don't know. I think I like the smoothed line, but don't have super-strong feelings.

@bwiernik
Copy link
Contributor

Maybe tell it to always use gam() with thin-plate splines and pretty tight penalty instead of loess?

@strengejacke
Copy link
Member Author

And we could add an argument that lets users optionally add such a smooth-line.

@bwiernik
Copy link
Contributor

I mean we should have a smooth line I think, but a more penalized one

@strengejacke
Copy link
Member Author

How to to this with stat_smooth()?

@strengejacke
Copy link
Member Author

I just saw that binned_residuals() includes SEs, maybe we can use those for constructing CIs? It would probably good to compare against mean_ci_boot, but I'm not sure how...

library(ggplot2)
library(performance)
model <- glm(vs ~ wt + mpg, data = mtcars, family = "binomial")
result <- binned_residuals(model)
plot(result) + geom_errorbar(aes(ymin = CI_low, ymax = CI_high, colour = group))

aa <- broom::augment(model)

gg1 <- ggplot(aa, aes(.fitted, .resid)) +
  stat_summary_bin(fun.data = mean_cl_boot) +
  geom_hline(yintercept = 0, lty = 2) +
  geom_smooth()

gg1
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'
#> Warning: Removed 11 rows containing missing values (geom_segment).

Created on 2022-02-28 by the reprex package (v2.0.1)

@strengejacke
Copy link
Member Author

Another example:

library(ggplot2)
library(performance)
library(HSAUR3)
#> Loading required package: tools

gm2 <- glm(outcome ~ treatment * visit,
             data = toenail,
             family = binomial)

nobs(gm2)
#> [1] 1908

binned_residuals(gm2) |> 
  plot() + 
  stat_smooth(method = "gam", ggplot2::aes(y = ybar), colour = "grey70", se = FALSE) +
  geom_errorbar(aes(ymin = CI_low, ymax = CI_high, colour = group)) +
  see::theme_lucid()
#> `geom_smooth()` using formula 'y ~ s(x, bs = "cs")'

Created on 2022-02-28 by the reprex package (v2.0.1)

@bwiernik
Copy link
Contributor

bwiernik commented Feb 28, 2022

I think the CI_low and CI_high values might be wrong as currently written:

d <- suppressWarnings(lapply(1:n_bins, function(.x) {
    items <- (1:length(pred))[model.binned == .x]
    model.range <- range(pred[items], na.rm = TRUE)
    xbar <- mean(pred[items], na.rm = TRUE)
    ybar <- mean(y[items], na.rm = TRUE)
    n <- length(items)
    sdev <- stats::sd(y[items], na.rm = TRUE)

    data.frame(
      xbar = xbar,
      ybar = ybar,
      n = n,
      x.lo = model.range[1],
      x.hi = model.range[2],
      se = 2 * sdev / sqrt(n)
    )
  }))

  d <- do.call(rbind, d)
  d <- d[stats::complete.cases(d), ]

  # CIs
  d$CI_low <- d$ybar - stats::qnorm(.975) * d$se
  d$CI_high <- d$ybar + stats::qnorm(.975) * d$se

The computation of se already has 2 * in it, so I think those are 95% confidence margins, not SEs. But maybe I'm misunderstanding

@strengejacke
Copy link
Member Author

Let me look into the Gelman/Hill book, I'm, not sure, but I think I roughly adopted the code from that book, or the arm-package.

strengejacke added a commit that referenced this issue Mar 1, 2022
@strengejacke
Copy link
Member Author

I think you're right, this looks better to me:

library(ggplot2)
library(performance)
library(HSAUR3)
#> Loading required package: tools
#> Loading required package: tools

gm2 <- glm(outcome ~ treatment * visit,
           data = toenail,
           family = binomial)

binned_residuals(gm2) |> 
  plot() + 
  stat_smooth(method = "gam", aes(y = ybar), colour = "grey70", se = FALSE) +
  geom_errorbar(aes(ymin = CI_low, ymax = CI_high, colour = group)) +
  see::theme_lucid()
#> `geom_smooth()` using formula 'y ~ s(x, bs = "cs")'

Created on 2022-03-01 by the reprex package (v2.0.1)

@bwiernik
Copy link
Contributor

bwiernik commented Mar 1, 2022

Yeah, Gelman tends to approximate the critical value for 95% intervals as "2", so that's what jumped out to me in the code

@strengejacke
Copy link
Member Author

strengejacke commented Mar 1, 2022

The range of the CI is then identical to the range of error bound per bin - it's just that due to the point estimate, error bars of the point estimates cross the error bounds. So, the "se" refers to the error bounds, probably not to the point estimates (unless these are the same)

@bwiernik
Copy link
Contributor

bwiernik commented Mar 1, 2022

They are the same. The SE is the standard error of the point estimate (it's the typical standard error of the mean SD / sqrt(N) ). It's just a question of whether the error bands are drawn around the point estimate or the null hypothesis value.

I think I like it with both the error band around zero and the error bars on the points. The error bars help to interpret the smooth line I think.

What do you think?

@strengejacke
Copy link
Member Author

Yes, I'd also say we should keep the error bars. What about the smooth line, you were suggesting using method = "gam", is this enough?

stat_smooth(method = "gam", aes(y = ybar), colour = "grey70", se = FALSE)

@strengejacke
Copy link
Member Author

btw, when the term argument is used, we already add a smooth-line:

library(ggplot2)
library(performance)
library(HSAUR3)
#> Loading required package: tools

gm2 <- glm(outcome ~ treatment * visit,
           data = toenail,
           family = binomial)

binned_residuals(gm2, term = "visit") |> 
  plot()

Created on 2022-03-01 by the reprex package (v2.0.1)

@bwiernik
Copy link
Contributor

bwiernik commented Mar 1, 2022

I think that looks good. Should we make the line color the same as in the other plots? (Green I think?)

@strengejacke
Copy link
Member Author

(its already green)
should we use "loess" or "gam" (and if gam, some special formula?

@strengejacke
Copy link
Member Author

library(performance)
library(HSAUR3)
#> Loading required package: tools

gm2 <- glm(outcome ~ treatment * visit,
           data = toenail,
           family = binomial)

binned_residuals(gm2) |> plot()

Created on 2022-03-01 by the reprex package (v2.0.1)

@bwiernik
Copy link
Contributor

bwiernik commented Mar 1, 2022

Looks good to me! Let's add theme_modern() to be line the others in check_model()

@strengejacke
Copy link
Member Author

Yeah, this was the standalone-plot. for check_model, this is the default:

image

Don't forget #376 (comment) :-)

@bwiernik
Copy link
Contributor

bwiernik commented Mar 1, 2022

Thanks for the reminder

@bwiernik
Copy link
Contributor

bwiernik commented Mar 1, 2022

Also going to make a better PP check plot for categorical responses

@bbolker
Copy link

bbolker commented Oct 23, 2023

I might like to reopen this. The first plot is from check_model(), the second is from plot(binned_residuals()) (not sure why these differ?) -- the latter is incomplete with warnings etc. This is a case where the Gaussian approx for the distribution of binomial residuals is really bad. I know we talked earlier about using bootstrap residuals; a shifted exact binomial CI might work as well ...

It would also be nice to have an option to do the binned-residuals plot on a logit-logit scale, as that would show nonlinear patterns on the link scale more reliably ...

library(performance)

library(ggplot2); theme_set(theme_bw())
data("kyphosis", package = "rpart")
kyphosis <- transform(
    kyphosis,
    Kyphosis = as.numeric(factor(Kyphosis))-1
)
m <- glm(
    Kyphosis~Age+Start+Number,
    data=kyphosis,
    family = binomial(link = "logit")
)
check_model(m)
plot(binned_residuals(m))

Screenshot from 2023-10-22 19-58-47

Screenshot from 2023-10-22 19-59-08

@bwiernik
Copy link
Contributor

The missing points and error bars might be due to them going out of bounds? Could you elaborate on the details of what you're suggesting with logit-logit?

@bwiernik bwiernik reopened this Oct 23, 2023
@bbolker
Copy link

bbolker commented Oct 23, 2023

At least for the x axis, it would be nice to be able to plot on the log-odds rather than the probability scale. Not sure whether this makes sense for the y-axis or not (which shows residuals, rather than predictions — not sure what kinds of residuals these are, or whether it makes sense to think about them on the link or the linear predictor scale; if they're Pearson, then it's probably OK).

strengejacke added a commit to easystats/see that referenced this issue Oct 24, 2023
@strengejacke
Copy link
Member Author

Dot issue should be fixed:

library(performance)
library(ggplot2)
theme_set(theme_bw())
data("kyphosis", package = "rpart")
kyphosis <- transform(
  kyphosis,
  Kyphosis = as.numeric(factor(Kyphosis)) - 1
)
m <- glm(
  Kyphosis ~ Age + Start + Number,
  data = kyphosis,
  family = binomial(link = "logit")
)
check_model(m)

plot(binned_residuals(m))

check_model(m, show_dots = FALSE)

plot(binned_residuals(m, show_dots = FALSE))

# or
plot(binned_residuals(m), show_dots = FALSE)

You need:

packageVersion("see")
#> [1] '0.8.0.6'
packageVersion("performance")
#> [1] '0.10.6.2'

(should be available via easystats::install_latest(force = TRUE) in 1-2 hours).

strengejacke added a commit that referenced this issue Oct 24, 2023
@strengejacke
Copy link
Member Author

How would we achieve using a log-odds scale? Binned residuals are based on fitted values, and the "mean" of 0 and 1 for a certain range of the predictors, i.e. everything is on the probability scale by default.

@bbolker
Copy link

bbolker commented Oct 24, 2023

I agree that putting the y-axis on the log-odds scale isn't practical (and if the machinery is using Pearson residuals — I haven't checked — that's probably fine). Thanks for fixing the other issue. The main other adjustment I think would be worthwhile is something to handle the failure of the Gaussian approximation - either bootstrap or shifted exact CIs (what's going on in this case with those lowest four bins is that there are 0/9 successes).

Looks like + scale_x_continuous(trans = "logit") does approximately what I want with the x-axis.

I've been hacking around on my fork. Bootstrapped residuals didn't make much difference. I'm not sure whether shifted the exact binomial CI makes sense, but it seems to ...

Note that the example given in arm::binnedplot uses deviance residuals (which are more similar to Pearson than to raw residuals), whereas binned_residuals effectively uses raw/response-scale residuals (predicted - fitted) - that may be a large source of the issue ...

Here's what I get with my hacked version (shifted exact CIs and a log-odds scale y axis) (there are a few issues left: (1) I haven't adjusted the envelope; (2) I'm not sure what's up with the last bin where the range doesn't overlap the points, maybe shifting doesn't always work? (3) I get "Warning in eval_tidy(x[[2]], data, env) : restarting interrupted promise evaluation" warnings for reasons I don't understand ...

Screenshot from 2023-10-24 10-04-12

This was referenced Oct 24, 2023
@strengejacke
Copy link
Member Author

I have opened a PR (#641) - I first tried to open a PR based on your fork, but this somehow did not work.

Regarding:

(3) I get "Warning in eval_tidy(x[[2]], data, env) : restarting interrupted promise evaluation" warnings for reasons I don't understand ...

This is due to

stats:::binom.test(sum(y0[items]), n)$conf.int - fv

stats:::binom.test(sum(y0[items]), n)$conf.int returns a vector of 2 (ci low/high), while fv are fitted values of the model, thus exact returns a much larger vector than expected:

stats:::binom.test(sum(y0[items]), n)$conf.int
 num [1:2] 0 0.336
y0[items]
 num [1:9] 0 0 0 0 0 0 0 0 0
fv
 Named num [1:81] 0.257 0.1225 0.493 0.458 0.0299 ...

I think this is also the reason for

(2) I'm not sure what's up with the last bin where the range doesn't overlap the points, maybe shifting doesn't always work?

Maybe you can directly commit to #641?

@strengejacke
Copy link
Member Author

What about this?

library(performance)
data("kyphosis", package = "rpart")
kyphosis <- transform(
  kyphosis,
  Kyphosis = as.numeric(factor(Kyphosis)) - 1
)
m <- glm(
  Kyphosis ~ Age + Start + Number,
  data = kyphosis,
  family = binomial(link = "logit")
)

plot(binned_residuals(m, ci_type = "exact"), show_dots = TRUE)

plot(binned_residuals(m, ci_type = "gaussian"), show_dots = TRUE)

plot(binned_residuals(m, ci_type = "boot"), show_dots = TRUE)

plot(binned_residuals(m, ci_type = "exact", residuals = "response"), show_dots = TRUE)

plot(binned_residuals(m, ci_type = "exact", residuals = "deviance"), show_dots = TRUE)

plot(binned_residuals(m, ci_type = "exact", residuals = "pearson"), show_dots = TRUE)

plot(binned_residuals(m, ci_type = "gaussian", residuals = "response"), show_dots = TRUE)

plot(binned_residuals(m, ci_type = "gaussian", residuals = "deviance"), show_dots = TRUE)

plot(binned_residuals(m, ci_type = "gaussian", residuals = "pearson"), show_dots = TRUE)

Created on 2023-10-24 with reprex v2.0.2

@strengejacke
Copy link
Member Author

@bbolker We should decide on sensible defaults. For now, I used deviance residuals and exact CI.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement 💥 Implemented features can be improved or revised
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants