Non-collapsibility: additive vs multiplicative models #234

malcolmbarrett · 2024-03-12T17:05:20Z

I think I have a working simulation that shows the contrast in non-collapsibility effects in models that were generated with additive and multiplicative models.

@LucyMcGowan, what do you think?

One thing I find weird but may be doing incorrectly: the LPM works for the additive model but not for the multiplicative model. I would have guessed using the LPM would workish for the multiplicative model because even though additive, it's collapsible. Could it be due to something in the generating process? Is my expectation wrong about what log odds ratio it should produce?

set.seed(12345)
library(tidyverse)
n <- 1000000

multiplicative_data <- tibble(
  x1 = rnorm(n, mean = 0, sd = 1),
  x2 = rnorm(n, mean = 0, sd = 1),
  x3 = rnorm(n, mean = 0, sd = 1)
)

# baseline log-odds
beta0 <- -1
# coef log-odds ratio
beta1 <- 1.1
beta2 <- -1.1
beta3 <- 0.5

# calculate multiplicative probability on logistic scale
prob <- with(
  multiplicative_data, 
  plogis(beta0 + beta1 * x1 + beta2 * x2 + beta3 * x3 + rnorm(n, sd = .1))
)

multiplicative_data <- multiplicative_data |>
  mutate(y = rbinom(n, 1, prob))

# answer is wrong without all three
glm(y ~ x1, data = multiplicative_data, family = binomial())$coefficients
#> (Intercept)          x1 
#>  -0.7873634   0.8763435
glm(y ~ x1 + x2 + x3, data = multiplicative_data, family = binomial())$coefficients
#> (Intercept)          x1          x2          x3 
#>  -0.9956394   1.0980438  -1.1027489   0.4968647

# using a linear probability model
coefs <- lm(y ~ x1, data = multiplicative_data)$coefficients
coefs
#> (Intercept)          x1 
#>   0.3368481   0.1696988
# log odds ratio. Also wrong?
log((coefs[2] / (2 - coefs[2])) / (coefs[1] / (1 - coefs[1])))
#>        x1 
#> -1.700838


additive_data <- tibble(
  x1 = rnorm(n, mean = 0, sd = 1),
  x2 = rnorm(n, mean = 0, sd = 1),
  x3 = rnorm(n, mean = 0, sd = 1)
)

# Baseline probability
beta0 <- 0.5
# Coef risk difference
beta1 <- 0.1  
beta2 <- 0.1  
beta3 <- 0.1

# Calculate probabilities using a linear combination of predictors
# Clamp to 0 and 1
prob <- with(additive_data, beta0 + beta1 * x1 + beta2 * x2 + beta3 * x3)
prob <- pmin(pmax(prob, 0), 1)

additive_data <- additive_data |>
  mutate(y = rbinom(n, 1, prob))

# expected log odds ratio of each coef:
# log((.6/(1-.6)) / (.5/(1-.5))) = 0.4055
# answer is wrong with more than one predictor of y
glm(y ~ x1, data = additive_data, family = binomial())$coefficients
#>  (Intercept)           x1 
#> -0.001851411  0.417459645
glm(y ~ x1 + x2 + x3, data = additive_data, family = binomial())$coefficients
#>  (Intercept)           x1           x2           x3 
#> -0.001874116  0.455620087  0.452392691  0.454884135

# correct on the linear, collapsible scale
lm(y ~ x1, data = additive_data)$coefficients
#> (Intercept)          x1 
#>   0.4995522   0.1001733
lm(y ~ x1 + x2 + x3, data = additive_data)$coefficients
#> (Intercept)          x1          x2          x3 
#>  0.49955491  0.10017435  0.09946685  0.09999777

^{Created on 2024-03-12 with reprex v2.1.0}


<sup>Created on 2024-03-12 with [reprex v2.1.0](https://reprex.tidyverse.org)</sup>

The text was updated successfully, but these errors were encountered:

malcolmbarrett · 2024-03-12T17:57:21Z

Note: additive model is collapsible but wrong, e.g. adding the other two will give the same (wrong?) answer. It's because it's the wrong data gen mechanism

malcolmbarrett · 2024-03-12T17:58:40Z

log odds calc is wrong for the first one because it doesn't include the baseline probability

LucyMcGowan · 2024-03-12T19:41:08Z

just to follow up on your comment, I think this should be:

log(((coefs[1] + coefs[2]) / (1 - (coefs[1] + coefs[2]))) / (coefs[1] / (1 - coefs[1])))

but also this is kind of a weird thing to do here because it will be different for every difference in x1 I think (like here we are saying the difference between x1 = 0 and x1 = 1, but because the odds ratios are not linear if instead you looked at the difference between x1 = 1 and x1 = 2 it would be a different value

LucyMcGowan · 2024-03-12T19:46:52Z

I also don't think you want random error here:

  plogis(beta0 + beta1 * x1 + beta2 * x2 + beta3 * x3 + rnorm(n, sd = .1))

because the randomness is already baked in by pulling from plogis, and then also by running through rbinom. I think maybe would make more sense to just have the randomness come from the draw from the binomial distribution, so something like this:

prob <- with(
  multiplicative_data, 
  1 / (1 + exp(-(beta0 + beta1 * x1 + beta2 * x2 + beta3 * x3))
)

update collapsibility comment (related to #234)

LucyMcGowan · 2024-03-15T15:54:03Z

Ah maybe I’m wrong! Is plogis deterministic? This seems to say yes! Edit: oh yeah of course it is 🫣 my poor little brain hahaha. I still think we don’t need rnorm in there since we already do rbinom but plogis is good 😂

https://x.com/noah_greifer/status/1768505576143659132

LucyMcGowan added a commit that referenced this issue Mar 14, 2024

update collapsibility comment (related to #234)

5c7e305

This was referenced Mar 14, 2024

Address collapsibility #236

Closed

update collapsibility comment (related to #234) #235

Merged

malcolmbarrett added a commit that referenced this issue Mar 14, 2024

Merge pull request #235 from r-causal/quartet-collapse

c058e4c

update collapsibility comment (related to #234)

malcolmbarrett added this to the Chapter 11: Causal Estimands milestone Sep 15, 2024

malcolmbarrett mentioned this issue Oct 4, 2024

Add section on binary outcome effects and related topics #271

Merged

1 task

malcolmbarrett closed this as completed in #271 Oct 10, 2024

malcolmbarrett self-assigned this Oct 17, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Non-collapsibility: additive vs multiplicative models #234

Non-collapsibility: additive vs multiplicative models #234

malcolmbarrett commented Mar 12, 2024 •

edited

Loading

malcolmbarrett commented Mar 12, 2024

malcolmbarrett commented Mar 12, 2024

LucyMcGowan commented Mar 12, 2024 •

edited

Loading

LucyMcGowan commented Mar 12, 2024

LucyMcGowan commented Mar 15, 2024 •

edited

Loading

Non-collapsibility: additive vs multiplicative models #234

Non-collapsibility: additive vs multiplicative models #234

Comments

malcolmbarrett commented Mar 12, 2024 • edited Loading

malcolmbarrett commented Mar 12, 2024

malcolmbarrett commented Mar 12, 2024

LucyMcGowan commented Mar 12, 2024 • edited Loading

LucyMcGowan commented Mar 12, 2024

LucyMcGowan commented Mar 15, 2024 • edited Loading

malcolmbarrett commented Mar 12, 2024 •

edited

Loading

LucyMcGowan commented Mar 12, 2024 •

edited

Loading

LucyMcGowan commented Mar 15, 2024 •

edited

Loading