# Synthetic Difference-in-Differences
Synthetic difference-in-differences (SDID), proposed by @arkhangelsky2021synthetic, is also known as a weighted double-differencing estimator.
- Setting: Researchers use panel data to study effects of policy changes.
- Panel data: repeated observations across time for various units.
- Some units exposed to policy at different times than others.
- Policy changes often aren't random across units or time.
- Challenge: Observed covariates might not lead to credible conclusions of no confounding [@imbens2015causal]
- To estimate the effects, researchers typically use one of two approaches:
- [Difference-in-differences] (DID) method widely used in applied economics.
- [Synthetic Control] (SC) methods offer alternative approach for comparative case studies.
- Difference between DID and SC:
- DID: used with many policy-exposed units; relies on "parallel trends" assumption.
- SC: used with few policy-exposed units; compensates lack of parallel trends by reweighting units based on pre-exposure trends.
- **New proposition**: Synthetic Difference in Differences (SDID).
- Combines features of DID and SC.
- Reweights and matches pre-exposure trends (similar to SC).
- Invariant to additive unit-level shifts, valid for large-panel inference (like DID).
- Attractive features:
- SDID provides consistent and asymptotically normal estimates.
- SDID performs on par with or better than DID in traditional DID settings.
- Where DID can only handle completely random treatment assignment, SDID can handle cases where treatment assignment is correlated with some latent time or unit factors.
- Similarly, SDID is as good as or better than SC in traditional SC settings.
- Uniformly random treatment assignment results in unbiased outcomes for all methods, but SDID is more precise.
- SDID reduces bias effectively for non-uniformly random assignments.
- SDID's double robustness is akin to that of the augmented inverse probability weighting estimator [@ben2021augmented; @scharfstein1999adjusting].
- It is very similar to the augmented SC estimator of @ben2021augmented [@arkhangelsky2021synthetic, p. 4112].
The ideal case for using the SDID estimator is when:
- $N_{ctr} \approx T_{pre}$
- Small $T_{post}$
- $N_{tr} <\sqrt{N_{ctr}}$
Applications in marketing:
- @lambrecht2024tv: TV ads on online browsing and sales.
- @keller2024soda: soda tax on marketing effectiveness.
------------------------------------------------------------------------
## Understanding
Consider traditional time-series cross-sectional (panel) data:
- A balanced panel of $N$ units and $T$ time periods
- $Y_{it}$ denotes the outcome for unit $i$ in period $t$
- $W_{it} \in \{0, 1\}$ is the binary treatment
- $N_c$ never-treated units (control)
- $N_t$ treated units after time $T_{pre}$
**Steps**:
1. Find unit weights $\hat{w}^{sdid}$ such that $\sum_{i = 1}^{N_c} \hat{w}_i^{sdid} Y_{it} \approx N_t^{-1} \sum_{i = N_c + 1}^N Y_{it} \forall t = 1, \dots, T_{pre}$ (i.e., pre-treatment trends in outcome of the treated similar to those of control units) (similar to SC).
2. Find time weights $\hat{\lambda}_t$ that balance the window (i.e., posttreatment outcomes for control units differ by a constant from their weighted average pretreatment outcomes).
3. Estimate the average causal effect of treatment
$$
(\hat{\tau}^{sdid}, \hat{\mu}, \hat{\alpha}, \hat{\beta}) = \arg \min_{\tau, \mu, \alpha, \beta} \{ \sum_{i = 1}^N \sum_{t = 1}^T (Y_{it} - \mu - \alpha_i - \beta_ t - W_{it} \tau)^2 \hat{w}_i^{sdid} \hat{\lambda}_t^{sdid} \}
$$
SDID improves on the DID estimator because $\hat{\tau}^{did}$ uses neither time nor unit weights:
$$
(\hat{\tau}^{did}, \hat{\mu}, \hat{\alpha}, \hat{\beta}) = \arg \min_{\tau, \mu, \alpha, \beta} \{ \sum_{i = 1}^N \sum_{t = 1}^T (Y_{it} - \mu - \alpha_i - \beta_ t - W_{it} \tau)^2 \}
$$
SDID improves on the SC estimator because $\hat{\tau}^{sc}$ lacks unit fixed effects and time weights:
$$
(\hat{\tau}^{sc}, \hat{\mu}, \hat{\beta}) = \arg \min_{\tau, \mu, \beta} \{ \sum_{i = 1}^N \sum_{t = 1}^T (Y_{it} - \mu - \beta_t - W_{it} \tau)^2 \hat{w}_i^{sc} \}
$$
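
The objectives above differ only in their weights and fixed effects, so they can all be viewed as weighted two-way fixed effects regressions. As a minimal sketch (simulated data; the helper `twfe()` and all variable names are hypothetical, not part of any package), the chunk below fits the regression with base R `lm()` and checks that uniform weights reproduce the familiar DID double difference in a balanced block design.

```{r}
# Toy balanced panel (hypothetical data): 4 control units, 2 treated units,
# 3 pre-treatment periods and 2 post-treatment periods.
N_c <- 4; N_t <- 2; T_pre <- 3; T_post <- 2
N <- N_c + N_t; T_all <- T_pre + T_post

set.seed(1)
panel <- expand.grid(unit = 1:N, time = 1:T_all)
panel$W <- as.integer(panel$unit > N_c & panel$time > T_pre)
# Outcome = unit effect + time effect + treatment effect of 2 + noise
panel$Y <- 0.5 * panel$unit + 0.3 * panel$time + 2 * panel$W +
  rnorm(nrow(panel), sd = 0.1)

# Weighted two-way fixed effects regression; uniform weights give DID.
twfe <- function(data, unit_w = rep(1, N), time_w = rep(1, T_all)) {
  wts <- unit_w[data$unit] * time_w[data$time]
  fit <- lm(Y ~ factor(unit) + factor(time) + W, data = data, weights = wts)
  unname(coef(fit)["W"])
}
tau_did <- twfe(panel)

# The classic 2x2 double difference of group means
m <- function(treated, post) {
  mean(panel$Y[(panel$unit > N_c) == treated & (panel$time > T_pre) == post])
}
double_diff <- (m(TRUE, TRUE) - m(TRUE, FALSE)) -
  (m(FALSE, TRUE) - m(FALSE, FALSE))

c(twfe = tau_did, double_diff = double_diff)
```

Passing estimated unit and time weights through `unit_w` and `time_w` would turn the same regression into the SDID objective; here they are left at their uniform defaults.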
+-------------------------------+--------------------------------------------------------------------+---------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------+
| | **DID** | **SC** | **SDID** |
+===============================+====================================================================+=================================================================================+====================================================================================================+
| **Primary Assumption** | Absence of intervention leads to parallel evolution across states. | Reweights unexposed states to match pre-intervention outcomes of treated state. | Reweights control units to ensure a parallel time trend with the treated pre-intervention trend. |
+-------------------------------+--------------------------------------------------------------------+---------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------+
| **Reliability Concern** | Can be unreliable when pre-intervention trends aren't parallel. | Accounts for non-parallel pre-intervention trends by reweighting. | Uses reweighting to adjust for non-parallel pre-intervention trends. |
+-------------------------------+--------------------------------------------------------------------+---------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------+
| **Treatment of Time Periods** | All pre-treatment periods are given equal weight. | Doesn't specifically emphasize equal weight for pre-treatment periods. | Focuses only on a subset of pre-intervention time periods, selected based on historical outcomes. |
+-------------------------------+--------------------------------------------------------------------+---------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------+
| **Goal with Reweighting** | N/A (doesn't use reweighting). | To match treated state as closely as possible before the intervention. | Make trends of control units parallel (not necessarily identical) to the treated pre-intervention. |
+-------------------------------+--------------------------------------------------------------------+---------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------+
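Alternatively, think of our parameter of interest as the difference between the average adjusted outcome of the treated units and a weighted average of the adjusted outcomes of the control units:
$$
\hat{\tau} = \hat{\delta}_{tr} - \sum_{i = 1}^{N_c} \hat{w}_i \hat{\delta}_i
$$
where $\hat{\delta}_{tr} = \frac{1}{N_t} \sum_{i = N_c + 1}^N \hat{\delta}_i$ and $\hat{\delta}_i$ is the adjusted outcome of unit $i$. The methods differ in the weights $\hat{w}_i$ and in how $\hat{\delta}_i$ is constructed: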
+------------+-------------------------------------------------------+-------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------+
| Method | Sample Weight | Adjusted outcomes ($\hat{\delta}_i$) | Interpretation |
+============+=======================================================+=============================================================================================================+==================================================================================+
| SC         | $\hat{w}^{sc} = \arg\min_{w \in \Omega} l_{unit}(w)$  | $\frac{1}{T_{post}} \sum_{t = T_{pre} + 1}^T Y_{it}$                                                        | Unweighted treatment period averages                                             |
+------------+-------------------------------------------------------+-------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------+
| DID | $\hat{w}_i^{did} = N_c^{-1}$ | $\frac{1}{T_{post}} \sum_{t = T_{pre}+ 1}^T Y_{it} - \frac{1}{T_{pre}} \sum_{t = 1}^{T_{pre}}Y_{it}$ | Unweighted differences between average treatment period and pretreatment outcome |
+------------+-------------------------------------------------------+-------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------+
| SDID       | $(\hat{w}_0,\hat{w}^{sdid})=\arg\min l_{unit}(w_0,w)$ | $\frac{1}{T_{post}} \sum_{t = T_{pre} + 1}^T Y_{it} - \sum_{t = 1}^{T_{pre}} \hat{\lambda}_t^{sdid} Y_{it}$ | Weighted differences between average treatment period and pretreatment outcome   |
+------------+-------------------------------------------------------+-------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------+
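
The adjusted-outcome representation in the table above can be computed directly from an $N \times T$ outcome matrix. The following is a small sketch on a made-up matrix. The weights `w_demo` and `lambda_demo` are illustrative placeholders rather than estimated SDID weights, and `tau_from_delta()` is a hypothetical helper.

```{r}
# Hypothetical outcome matrix: rows = units (controls first), columns = periods.
N_c <- 3; N_t <- 1; T_pre <- 3; T_post <- 2
Y <- rbind(
  c(1, 2, 3, 4, 5),    # control 1
  c(2, 3, 4, 5, 6),    # control 2
  c(0, 1, 2, 3, 4),    # control 3
  c(3, 4, 5, 9, 10)    # treated: same trend plus an effect of 3 after treatment
)
pre <- 1:T_pre
post <- (T_pre + 1):(T_pre + T_post)

# tau = delta_tr - sum_i w_i * delta_i for a given vector of adjusted outcomes
tau_from_delta <- function(delta, w) {
  mean(delta[(N_c + 1):(N_c + N_t)]) - sum(w * delta[1:N_c])
}

# DID: uniform unit weights and an unweighted pre-period average
delta_did <- rowMeans(Y[, post, drop = FALSE]) - rowMeans(Y[, pre, drop = FALSE])
tau_did <- tau_from_delta(delta_did, w = rep(1 / N_c, N_c))

# SDID-style: illustrative (not estimated) unit and time weights
w_demo <- c(0.5, 0.3, 0.2)
lambda_demo <- c(0.1, 0.3, 0.6)
delta_sdid <- rowMeans(Y[, post, drop = FALSE]) -
  as.vector(Y[, pre] %*% lambda_demo)
tau_sdid <- tau_from_delta(delta_sdid, w = w_demo)

c(did = tau_did, sdid_demo = tau_sdid)
```

Because every unit in this toy matrix follows exactly the same trend, both weighting schemes recover the same effect; the weights only start to matter once pre-treatment trends differ across units.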
- The SDID estimator uses weights:
- Makes two-way fixed effect regression "local."
- Emphasizes units similar in their past to treated units.
- Prioritizes periods resembling treated periods.
- Benefits of this localization:
1. **Robustness**: Using similar units and periods boosts estimator's robustness.
2. **Improved Precision**: Weights can eliminate predictable outcome components.
- The SEs of SDID are typically smaller than those of SC and DID.
- Caveat: If there's minor systematic heterogeneity in outcomes, unequal weighting might reduce precision compared to standard DID.
- Weight Design:
- **Unit Weights**: Makes average outcome for treated units roughly parallel to the weighted average for control units.
- **Time Weights**: Ensures posttreatment outcomes for control units differ by a constant from their weighted average pretreatment outcomes.
- Weights enhance DID's plausibility:
- Raw data often lacks parallel time trends for treated/control units.
- Similar techniques (e.g., adjusting for covariates or selecting specific time periods) were used before [@callaway2021difference].
- SDID automates this process, applying a similar logic to weight both units and time periods.
- Time Weights in SDID:
- Removes bias and boosts precision (i.e., minimizes the influence of time periods vastly different from posttreatment periods).
- Argument for Unit Fixed Effects:
1. **Flexibility**: Increases model flexibility and thereby bolsters robustness.
2. **Enhanced Precision**: Unit fixed effects explain a significant portion of outcome variation.
- SC Weighting & Unit Fixed Effects:
- Under certain conditions, SC weighting can inherently account for unit fixed effects.
- For example, when the weighted average outcome for control units in pretreatment is the same as that of the treated units. (unlikely in reality)
- The use of unit fixed effect in synthetic control regression (i.e., synthetic control with intercept) was proposed before in @doudchenko2016balancing and @ferman2021synthetic (called DIFP)
More details on application
1. Choose unit weights
- Regularization Parameter:
- Equal to the size of a typical one-period outcome change for control units in the pre-period, then multiplied by a scaling factor [@arkhangelsky2021synthetic, p. 4092].
- Relation to SC Weights:
- SDID weights are similar to those used in [@abadie2010synthetic] except two distinctions:
1. Inclusion of an Intercept Term:
- The weights in SynthDiD do not necessarily make the control pre-trends perfectly match the treated trends, just make them parallel.
- This flexibility comes from the use of unit fixed effects, which can absorb any consistent differences between units.
2. Regularization Penalty:
- Adopted from @doudchenko2016balancing .
- Enhances the dispersion and ensures the uniqueness of the weights.
- SDID weights reduce to those used in [@abadie2010synthetic] when the intercept and the regularization penalty are dropped and there is only one treated unit.
2. Choose time weights
- Time weights also include an intercept term but no regularization, because observations for the same unit are allowed to be correlated across time periods, whereas observations for different units within the same period are not.
**Note**: To account for time-varying covariates when computing the weights, one can replace the observed outcomes with the residuals from a regression of the outcome on these covariates: $Y_{it}^{res} = Y_{it} - X_{it} \hat{\beta}$, where $\hat{\beta}$ comes from the regression $Y_{it} = X_{it} \beta + \varepsilon_{it}$.
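
A minimal sketch of this residualization step, assuming a single time-varying covariate `X` and a pooled OLS first stage (both are assumptions for illustration; the data are simulated):

```{r}
# Hypothetical long-format panel with one time-varying covariate X
set.seed(42)
panel <- expand.grid(unit = 1:10, time = 1:8)
panel$X <- rnorm(nrow(panel))
panel$Y <- 1 + 0.8 * panel$X + 0.2 * panel$time +
  rnorm(nrow(panel), sd = 0.5)

# Regress the outcome on the covariate, then remove the fitted covariate part
fit <- lm(Y ~ X, data = panel)
panel$Y_res <- panel$Y - coef(fit)["X"] * panel$X   # Y - X * beta_hat

# Reshape the adjusted outcome into a units x periods matrix for the weights
Y_res_mat <- matrix(panel$Y_res[order(panel$time, panel$unit)],
                    nrow = 10, ncol = 8)
dim(Y_res_mat)
```

The matrix `Y_res_mat` would then take the place of the raw outcome matrix when computing the unit and time weights.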
The SDID method can account for systematic effects, often referred to as unit effects or unit heterogeneity, which influence treatment assignment (i.e., when treatment assignment is correlated with these systematic effects). Consequently, it provides unbiased estimates, especially valuable when there's a suspicion that the treatment might be influenced by persistent, unit-specific attributes.
Even under completely random assignment, where SDID, DID, and SC are all unbiased, SDID has the smallest SE.
------------------------------------------------------------------------
## Application
**SDID Algorithm**
1. Compute regularization parameter $\zeta$
$$
\zeta = (N_{t}T_{post})^{1/4} \hat{\sigma}
$$
where
$$
\hat{\sigma}^2 = \frac{1}{N_c(T_{pre}- 1)} \sum_{i = 1}^{N_c} \sum_{t = 1}^{T_{pre}-1}(\Delta_{it} - \hat{\Delta})^2
$$
- $\Delta_{it} = Y_{i(t + 1)} - Y_{it}$
- $\hat{\Delta} = \frac{1}{N_c(T_{pre} - 1)}\sum_{i = 1}^{N_c}\sum_{t = 1}^{T_{pre}-1} \Delta_{it}$
2. Compute unit weights $\hat{w}^{sdid}$
$$
(\hat{w}_0, \hat{w}^{sdid}) = \arg \min_{w_0 \in R, w \in \Omega}l_{unit}(w_0, w)
$$
where
- $l_{unit} (w_0, w) = \sum_{t = 1}^{T_{pre}}(w_0 + \sum_{i = 1}^{N_c}w_i Y_{it} - \frac{1}{N_t}\sum_{i = N_c + 1}^NY_{it})^2 + \zeta^2 T_{pre}||w||_2^2$
- $\Omega = \{w \in R_+^N: \sum_{i = 1}^{N_c} w_i = 1, w_i = N_t^{-1} \forall i = N_c + 1, \dots, N \}$
3. Compute time weights $\hat{\lambda}^{sdid}$
$$
(\hat{\lambda}_0 , \hat{\lambda}^{sdid}) = \arg \min_{\lambda_0 \in R, \lambda \in \Lambda} l_{time}(\lambda_0, \lambda)
$$
where
- $l_{time} (\lambda_0, \lambda) = \sum_{i = 1}^{N_c}(\lambda_0 + \sum_{t = 1}^{T_{pre}} \lambda_t Y_{it} - \frac{1}{T_{post}} \sum_{t = T_{pre} + 1}^T Y_{it})^2$
- $\Lambda = \{ \lambda \in R_+^T: \sum_{t = 1}^{T_{pre}} \lambda_t = 1, \lambda_t = T_{post}^{-1} \forall t = T_{pre} + 1, \dots, T\}$
4. Compute the SDID estimator
$$
(\hat{\tau}^{sdid}, \hat{\mu}, \hat{\alpha}, \hat{\beta}) = \arg \min_{\tau, \mu, \alpha, \beta}\{ \sum_{i = 1}^N \sum_{t = 1}^T (Y_{it} - \mu - \alpha_i - \beta_t - W_{it} \tau)^2 \hat{w}_i^{sdid}\hat{\lambda}_t^{sdid} \}
$$
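
The four steps above can be sketched in base R as follows. This is an illustrative implementation under stated assumptions, not the `synthdid` package's code. The simplex-constrained least squares problems are solved with a simple Frank-Wolfe iteration (the choice of solver is an assumption), the intercepts are profiled out by centering, the data are simulated, and `simplex_ls()` is a hypothetical helper.

```{r}
# Frank-Wolfe solver for
#   min_{w0, w in simplex} ||w0 + A w - b||^2 + eta * ||w||^2.
# The intercept w0 is profiled out by centering the columns of A and b.
simplex_ls <- function(A, b, eta = 0, n_iter = 5000) {
  A <- sweep(A, 2, colMeans(A)); b <- b - mean(b)
  w <- rep(1 / ncol(A), ncol(A))
  for (k in seq_len(n_iter)) {
    Aw <- as.vector(A %*% w)
    half_grad <- as.vector(crossprod(A, Aw - b)) + eta * w
    i <- which.min(half_grad)
    d <- -w; d[i] <- 1 - w[i]                 # direction towards vertex e_i
    if (all(d == 0)) break
    d_err <- A[, i] - Aw
    denom <- sum(d_err^2) + eta * sum(d^2)
    if (denom <= 0) break
    step <- -sum(half_grad * d) / denom       # exact line search
    w <- w + min(1, max(0, step)) * d         # clipped to [0, 1]
  }
  w
}

set.seed(123)
N_c <- 8; N_t <- 2; T_pre <- 6; T_post <- 2
N <- N_c + N_t; T_all <- T_pre + T_post
# Hypothetical data: unit effects + time effects + noise, true effect = 5
Y <- outer(rnorm(N), rep(1, T_all)) + outer(rep(1, N), 0.5 * (1:T_all)) +
  matrix(rnorm(N * T_all, sd = 0.05), N, T_all)
ctrl <- 1:N_c; trt <- (N_c + 1):N
pre <- 1:T_pre; post <- (T_pre + 1):T_all
Y[trt, post] <- Y[trt, post] + 5

# Step 1: regularization parameter from first differences of control units
Delta <- Y[ctrl, 2:T_pre] - Y[ctrl, 1:(T_pre - 1)]
sigma_hat <- sqrt(mean((Delta - mean(Delta))^2))
zeta <- (N_t * T_post)^(1 / 4) * sigma_hat

# Step 2: unit weights (rows of A = pre-periods, columns = control units)
w_hat <- simplex_ls(t(Y[ctrl, pre]), colMeans(Y[trt, pre, drop = FALSE]),
                    eta = zeta^2 * T_pre)

# Step 3: time weights (rows of A = control units, columns = pre-periods)
lambda_hat <- simplex_ls(Y[ctrl, pre], rowMeans(Y[ctrl, post, drop = FALSE]))

# Step 4: weighted double-difference form of the weighted regression
delta <- rowMeans(Y[, post, drop = FALSE]) - as.vector(Y[, pre] %*% lambda_hat)
tau_hat <- mean(delta[trt]) - sum(w_hat * delta[ctrl])
tau_hat
```

Step 4 uses the weighted double-difference representation rather than an explicit regression, which avoids fitting unit and time dummies. For real analyses, the `synthdid` package used later in this chapter is the appropriate tool.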
------------------------------------------------------------------------
**SE Estimation**
- Under certain assumptions (on the errors, the sample sizes, and the interaction between time and unit fixed effects) detailed in [@arkhangelsky2021synthetic, p. 4107], the estimation error of SDID is asymptotically normal and zero-centered.
- Using its asymptotic variance, conventional confidence intervals can be applied to SDID.
$$
\tau \in \hat{\tau}^{sdid} \pm z_{\alpha/2}\sqrt{\hat{V}_\tau}
$$
- There are 3 approaches for variance estimation in confidence intervals:
1. **Clustered Bootstrap [@efron1992bootstrap]:**
- Independently resample units.
- Advantages: Simple to use; performs robustly in large panels because resampling whole units is a natural approach to inference when observations of the same unit may be correlated.
- Disadvantage: Computationally expensive.
2. **Jackknife [@miller1974jackknife]:**
- Applied to weighted SDID regression with fixed weights.
- Generally conservative and precise when treated and control units are sufficiently similar.
- Not recommended for some methods, like the SC estimator, due to potential biases.
- Appropriate for jackknifing DID without random weights.
3. **Placebo Variance Estimation:**
- Can be used in cases with only one treated unit or large panels.
- Placebo evaluations swap out the treated unit for untreated ones to estimate noise.
- Relies on homoskedasticity across units: it uses the empirical distribution of the residuals from placebo estimators on control units, so its validity hinges on the noise distribution being the same across units.
- With a single treated unit, nonparametric variance estimation is infeasible, so homoskedasticity is needed for feasible inference. A detailed analysis is available in @conley2011inference.
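
Whichever variance estimator is chosen, the interval formula above is a one-liner in R. The point estimate and variance below are arbitrary illustrative numbers, not results from any data set.

```{r}
# Hypothetical point estimate and variance estimate (illustrative numbers only)
tau_hat <- 2.0
V_hat <- 0.25
alpha <- 0.05

z <- qnorm(1 - alpha / 2)                 # two-sided normal critical value
ci <- tau_hat + c(-1, 1) * z * sqrt(V_hat)
ci
```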
All algorithms are from @arkhangelsky2021synthetic, p. 4109:
> **Bootstrap Variance Estimation**
>
> 1. For each $b$ from $1 \to B$:
>
> - Sample $N$ rows from $(\mathbf{Y}, \mathbf{W})$ to get ($\mathbf{Y}^{(b)}, \mathbf{W}^{(b)}$) with replacement.
>
> - If the sample lacks treated or control units, resample.
>
> - Calculate $\hat{\tau}^{(b)}$ using ($\mathbf{Y}^{(b)}, \mathbf{W}^{(b)}$).
>
> 2. Calculate variance: $\hat{V}_\tau = \frac{1}{B} \sum_{b = 1}^B (\hat{\tau}^{(b)} - \frac{1}{B} \sum_{b = 1}^B \hat{\tau}^{(b)})^2$
> **Jackknife Variance Estimation**
>
> 1. For each $i$ from $1 \to N$:
> 1. Calculate $\hat{\tau}^{(-i)}$: $\arg\min_{\tau, \{\alpha_j, \beta_t\}} \sum_{j \neq i, t}(\mathbf{Y}_{jt} - \alpha_j - \beta_t - \tau \mathbf{W}_{jt})^2 \hat{w}_j \hat{\lambda}_t$
> 2. Calculate: $\hat{V}_{\tau} = (N - 1) N^{-1} \sum_{i = 1}^N (\hat{\tau}^{(-i)} - \hat{\tau})^2$
> **Placebo Variance Estimation**
>
> 1. For each $b$ from $1 \to B$:
> 1. Sample $N_t$ out of the $N_c$ control units without replacement to get the "placebo" treatment
> 2. Construct a placebo treatment matrix $\mathbf{W}_c^{(b)}$ for the controls
> 3. Calculate $\hat{\tau}^{(b)}$ based on $(\mathbf{Y}_c, \mathbf{W}_c^{(b)})$
> 2. Calculate $\hat{V}_\tau = \frac{1}{B}\sum_{b = 1}^B (\hat{\tau}^{(b)} - \frac{1}{B} \sum_{b = 1}^B \hat{\tau}^{(b)})^2$
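
To make the three resampling schemes concrete, here is a minimal base R sketch on simulated data. To keep it short, a plain DID double difference (`did_est()`, a hypothetical helper) stands in for the SDID estimator, so the sketch illustrates only the resampling logic, not the SDID fit itself.

```{r}
# Stand-in estimator (assumption): plain DID double difference on an N x T
# matrix whose first N_c rows are controls and first T_pre columns are
# pre-treatment periods.
did_est <- function(Y, N_c, T_pre) {
  post <- (T_pre + 1):ncol(Y)
  delta <- rowMeans(Y[, post, drop = FALSE]) -
    rowMeans(Y[, 1:T_pre, drop = FALSE])
  mean(delta[-(1:N_c)]) - mean(delta[1:N_c])
}

set.seed(2024)
N_c <- 20; N_t <- 5; T_pre <- 6; T_post <- 3; N <- N_c + N_t
Y <- matrix(rnorm(N * (T_pre + T_post)), N) + rnorm(N)  # noise + unit effects
post <- (T_pre + 1):(T_pre + T_post)
Y[(N_c + 1):N, post] <- Y[(N_c + 1):N, post] + 2        # true effect = 2
treated <- rep(c(FALSE, TRUE), c(N_c, N_t))
B <- 200

# Bootstrap: resample units with replacement, redraw if a group is missing
boot <- replicate(B, {
  repeat {
    idx <- sample(N, N, replace = TRUE)
    if (any(treated[idx]) && any(!treated[idx])) break
  }
  idx <- idx[order(treated[idx])]                       # keep controls first
  did_est(Y[idx, ], sum(!treated[idx]), T_pre)
})
V_boot <- mean((boot - mean(boot))^2)

# Jackknife: drop one unit at a time
jack <- sapply(1:N, function(i) {
  did_est(Y[-i, ], N_c - as.integer(!treated[i]), T_pre)
})
V_jack <- (N - 1) / N * sum((jack - did_est(Y, N_c, T_pre))^2)

# Placebo: assign N_t pseudo-treated units among the controls only
plac <- replicate(B, {
  idx <- sample(N_c)                     # last N_t rows act as "treated"
  did_est(Y[idx, ], N_c - N_t, T_pre)
})
V_plac <- mean((plac - mean(plac))^2)

se <- sqrt(c(bootstrap = V_boot, jackknife = V_jack, placebo = V_plac))
se
```

With the SDID estimator, the bootstrap and placebo loops would re-estimate the weights on every draw, while the jackknife would keep them fixed, as described above.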
### Block Treatment
Code provided by the `synthdid` package
```{r}
library(synthdid)
library(tidyverse)
# Estimate the effect of California Proposition 99 on cigarette consumption
data('california_prop99')
# Convert the long panel into the outcome matrix Y and the block sizes N0, T0
setup = synthdid::panel.matrices(synthdid::california_prop99)
tau.hat = synthdid::synthdid_estimate(setup$Y, setup$N0, setup$T0)
# Placebo standard error (uncomment to compute)
# se = sqrt(vcov(tau.hat, method = 'placebo'))
plot(tau.hat) + causalverse::ama_theme()
```
```{r compare between different estimators, collapse=TRUE}
setup = synthdid::panel.matrices(synthdid::california_prop99)
# Run for specific estimators
results_selected = causalverse::panel_estimate(setup,
selected_estimators = c("synthdid", "did", "sc"))
results_selected
# to access more details in the estimate object
summary(results_selected$did$estimate)
causalverse::process_panel_estimate(results_selected)
```
### Staggered Adoption
To apply to staggered adoption settings using the SDID estimator (see examples in @arkhangelsky2021synthetic, p. 4115 similar to @ben2022synthetic), we can:
1. Apply the SDID estimator repeatedly, once for every adoption date.
2. Using the method of @ben2022synthetic, form matrices for each adoption date. Apply SDID to each and average the estimates based on treated unit/time-period fractions.
3. Create multiple samples by splitting the data up by time periods. Each sample should have a consistent adoption date.
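
The cohort-by-cohort logic can be sketched in base R as follows. This is an illustration on simulated data under simplifying assumptions: each adoption cohort is compared only with never-treated units, a plain DID double difference (`block_did()`, a hypothetical helper) stands in for the SDID fit, and cohort estimates are averaged using the share of treated unit-periods as weights.

```{r}
# Hypothetical staggered panel: units adopt in period 4, period 6, or never
set.seed(7)
units <- data.frame(id = 1:30, adopt = rep(c(4, 6, Inf), each = 10))
panel <- merge(units, data.frame(time = 1:8))   # cross join: 30 x 8 rows
panel$W <- as.integer(panel$time >= panel$adopt)
panel$Y <- 0.1 * panel$id + 0.5 * panel$time + 3 * panel$W +
  rnorm(nrow(panel), sd = 0.1)                  # true effect = 3

# Stand-in for the SDID fit on one block-design sample (assumption: plain DID)
block_did <- function(d, a) {
  tr <- is.finite(d$adopt)
  post <- d$time >= a
  (mean(d$Y[tr & post]) - mean(d$Y[tr & !post])) -
    (mean(d$Y[!tr & post]) - mean(d$Y[!tr & !post]))
}

cohorts <- sort(unique(units$adopt[is.finite(units$adopt)]))
by_cohort <- t(sapply(cohorts, function(a) {
  # One sample per adoption date: that cohort plus the never-treated units
  d <- panel[panel$adopt == a | is.infinite(panel$adopt), ]
  c(adopt = a, tau = block_did(d, a), n_treated_cells = sum(d$W))
}))

# Average the cohort estimates, weighting by the number of treated unit-periods
tau_avg <- weighted.mean(by_cohort[, "tau"], by_cohort[, "n_treated_cells"])
by_cohort
tau_avg
```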
For a formal note on this special case, see @porreca2022synthetic. In a staggered treatment adoption context, it compares the outcomes from using SDID with those from other estimators:
- Two-Way Fixed Effects (TWFE),
- The group-time average treatment effect estimator from @callaway2021difference,
- The partially pooled synthetic control method estimator from @ben2022synthetic.
```{=html}
<!-- -->
```
- The findings reveal that SynthDiD produces a different estimate of the average treatment effect compared to the other methods.
- Simulation results suggest that these differences could be due to the SynthDiD's data generating process assumption (a latent factor model) aligning more closely with the actual data than the additive fixed effects model assumed by traditional DiD methods.
To explore heterogeneity of treatment effect, we can do subgroup analysis [@berman2022value, p. 1092]
+--------------------------------------------------------------------+-------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------+
| **Method** | **Advantages** | **Disadvantages** | **Procedure** |
+====================================================================+===================================================================+======================================================================================================================================================+================================================================================================+
| Split Data into Subsets | Compares treated units to control units within the same subgroup. | Each subset uses a different synthetic control, making it challenging to compare effects across subgroups. | 1. Split the data into separate subsets for each subgroup. |
| | | | 2. Compute synthetic DID effects for each subset. |
+--------------------------------------------------------------------+-------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------+
| Control Group Comprising All Non-adopters | Control weights match pretrends well for each treated subgroup. | Each control unit receives a different weight for each treatment subgroup, making it difficult to compare results due to varying synthetic controls. | 1. Use a control group consisting of all non-adopters in each balanced panel cohort analysis. |
| | | | 2. Switch treatment units to the subgroup being analyzed. |
| | | | 3. Perform `synthdid` analysis. |
+--------------------------------------------------------------------+-------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------+
| Use All Data to Estimate Synthetic Control Weights **(recommend)** | All units have the same synthetic control. | Pretrend match may not be as accurate since it aims to match the average outcome of all treated units, not just a specific subgroup. | 1. Use all the data to estimate the synthetic DID control weights. |
| | | | 2. Compute treatment effects using only the treated subgroup units as the treatment units. |
+--------------------------------------------------------------------+-------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------+
```{r}
library(tidyverse)
df <- fixest::base_stagg |>
dplyr::mutate(treatvar = if_else(time_to_treatment >= 0, 1, 0)) |>
dplyr::mutate(treatvar = as.integer(if_else(year_treated > (5 + 2), 0, treatvar)))
est <- causalverse::synthdid_est_ate(
data = df,
adoption_cohorts = 5:7,
lags = 2,
leads = 2,
time_var = "year",
unit_id_var = "id",
treated_period_var = "year_treated",
treat_stat_var = "treatvar",
outcome_var = "y"
)
data.frame(
Period = names(est$TE_mean_w),
ATE = est$TE_mean_w,
SE = est$SE_mean_w
) |>
causalverse::nice_tab()
causalverse::synthdid_plot_ate(est)
```
```{r synthdid subgroup analysis}
est_sub <- causalverse::synthdid_est_ate(
data = df,
adoption_cohorts = 5:7,
lags = 2,
leads = 2,
time_var = "year",
unit_id_var = "id",
treated_period_var = "year_treated",
treat_stat_var = "treatvar",
outcome_var = "y",
# a vector of subgroup id (from unit id)
subgroup = c(
# some are treated
"11", "30", "49" ,
# some are control within this period
"20", "25", "21")
)
data.frame(
Period = names(est_sub$TE_mean_w),
ATE = est_sub$TE_mean_w,
SE = est_sub$SE_mean_w
) |>
causalverse::nice_tab()
causalverse::synthdid_plot_ate(est_sub)
```
Plot different estimators
```{r, eval=FALSE}
library(causalverse)
methods <- c("synthdid", "did", "sc", "sc_ridge", "difp", "difp_ridge")
estimates <- lapply(methods, function(method) {
synthdid_est_ate(
data = df,
adoption_cohorts = 5:7,
lags = 2,
leads = 2,
time_var = "year",
unit_id_var = "id",
treated_period_var = "year_treated",
treat_stat_var = "treatvar",
outcome_var = "y",
method = method
)
})
plots <- lapply(seq_along(estimates), function(i) {
causalverse::synthdid_plot_ate(estimates[[i]],
title = methods[i],
theme = causalverse::ama_theme(base_size = 6))
})
gridExtra::grid.arrange(grobs = plots, ncol = 2)
```