Matched Set Objects
First, create a smaller subset of the data to make things a little easier to see and work with for the purposes of these examples.
uid <- unique(dem$wbcode2)[1:10]
subdem <- dem[dem$wbcode2 %in% uid, ]
DisplayTreatment(unit.id = "wbcode2", time.id = "year", treatment = 'dem', data = subdem)
We will use the PanelMatch function with refinement.method set to "none" to obtain a PanelMatch object, from which we will extract a matched.set object.
PM.results <- PanelMatch(lag = 4, time.id = "year", unit.id = "wbcode2",
                         treatment = "dem", refinement.method = "none",
                         data = subdem, match.missing = TRUE,
                         qoi = "att", outcome.var = "y",
                         lead = 0, forbid.treatment.reversal = FALSE)
#Extract the matched.set object
msets <- PM.results$att
PanelMatch returns an S3 object of class PanelMatch. These objects are lists with some additional attributes. Here, we focus on one element contained within PanelMatch objects: the matched.set object. Within a PanelMatch object, this element is always named either att or atc. When qoi = "ate", the resulting PanelMatch object contains two matched.set objects, named att and atc, respectively.
In implementation, a matched.set is a named list with some added attributes (the lag, and the names of the treatment, unit, and time variables) and a structured naming scheme. Each entry in the list is a vector containing the unit ids of the control units in one matched set. Each entry also corresponds to a unit/time id pair: the unit id of a treated unit and the time at which treatment occurred. This is reflected in the name of each list element, which follows the scheme [id variable].[time variable].
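To make this layout concrete, here is a minimal base R sketch that mimics it by hand. It does not call PanelMatch or reproduce its internals: the object toy and the helper parse_set_names are hypothetical names introduced only for illustration, and the three entries are copied from the matched sets printed later on this page.

```r
# Toy stand-in for a matched.set object, built by hand (not produced by PanelMatch).
toy <- list(
  "4.1992" = c("3", "13"),   # treated unit 4, treated in 1992, controls 3 and 13
  "4.1997" = "7",
  "7.1998" = character(0)    # an empty matched set
)
attr(toy, "lag") <- 4
attr(toy, "t.var") <- "year"
attr(toy, "id.var") <- "wbcode2"

# Hypothetical helper: split "[id].[time]" names back into their two parts.
parse_set_names <- function(x) {
  parts <- strsplit(names(x), ".", fixed = TRUE)
  data.frame(
    id   = as.integer(vapply(parts, `[`, character(1), 1)),
    time = as.integer(vapply(parts, `[`, character(1), 2))
  )
}

keys <- parse_set_names(toy)
keys$matched.set.size <- unname(lengths(toy))  # number of controls per set
print(keys)
```

The sketch assumes unit ids contain no "." characters; an id scheme that does would need a different way of splitting the names.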
Matched set objects are implemented as lists, but the default printing behavior resembles that of a data frame. One can toggle the verbose option on the print method to print the object as a list and display a less summarized version of the matched set data.
names(msets)
[1] "4.1992" "4.1997" "6.1973" "6.1983" "7.1991" "12.1992" "13.2003" "7.1998"
#data frame printing view: useful as a summary view with large data sets
print(msets)
wbcode2 year matched.set.size
1 4 1992 2
2 4 1997 1
3 6 1973 1
4 6 1983 2
5 7 1991 4
6 12 1992 2
7 13 2003 2
8 7 1998 0
# first column is unit id variable, second is time variable, and
# third is the number of controls in that matched set
# prints as a list, shows all data at once
print(msets, verbose = TRUE)
$`4.1992`
[1] "3" "13"
attr(,"weights")
3 13
0.5 0.5
$`4.1997`
[1] "7"
attr(,"weights")
7
1
$`6.1973`
[1] "13"
attr(,"weights")
13
1
$`6.1983`
[1] "4" "13"
attr(,"weights")
4 13
0.5 0.5
$`7.1991`
[1] "3" "4" "12" "13"
attr(,"weights")
3 4 12 13
0.25 0.25 0.25 0.25
$`12.1992`
[1] "3" "13"
attr(,"weights")
3 13
0.5 0.5
$`13.2003`
[1] "3" "12"
attr(,"weights")
3 12
0.5 0.5
$`7.1998`
character(0)
attr(,"lag")
[1] 4
attr(,"t.var")
[1] "year"
attr(,"id.var")
[1] "wbcode2"
attr(,"treatment.var")
[1] "dem"
attr(,"refinement.method")
[1] "none"
attr(,"match.missing")
[1] TRUE
The '[' and '[[' operators are implemented and should work intuitively. Using '[' returns a subsetted matched.set object (a list); the custom operator also copies the additional attributes over to the result. Note that, by default, the subset prints in the same data frame style as the full matched.set object. Using '[[' returns the unit ids of the control units in the specified matched set.
Since matched.set objects are lists with attributes, '[' and '[[' behave much as they would with an ordinary list. For instance, users can extract information about matched sets using numerical indices or by taking advantage of the naming scheme.
msets[1]
wbcode2 year matched.set.size
1 4 1992 2
#prints the control units in this matched set
msets[[1]]
[1] "3" "13"
attr(,"weights")
3 13
0.5 0.5
msets["4.1992"] #equivalent to msets[1]
wbcode2 year matched.set.size
1 4 1992 2
msets[["4.1992"]] #equivalent to msets[[1]]
[1] "3" "13"
attr(,"weights")
3 13
0.5 0.5
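The reason a custom '[' operator is needed can be illustrated in base R: ordinary list subsetting keeps names but drops other attributes. The sketch below is a hypothetical illustration of that idea, not PanelMatch's actual method; the class name toy_set and the objects toy, plain, and sub are assumptions made up for this example.

```r
# Toy list with a custom class and extra attributes (hypothetical, not PanelMatch code).
toy <- structure(
  list("4.1992" = c("3", "13"), "4.1997" = "7"),
  lag = 4, id.var = "wbcode2", class = "toy_set"
)

# Base list subsetting keeps names but drops the other attributes.
plain <- unclass(toy)["4.1992"]
is.null(attr(plain, "lag"))

# A custom `[` method that copies the extra attributes onto the subset.
`[.toy_set` <- function(x, i) {
  out <- unclass(x)[i]
  attr(out, "lag") <- attr(x, "lag")
  attr(out, "id.var") <- attr(x, "id.var")
  class(out) <- class(x)
  out
}

sub <- toy[1]
attr(sub, "lag")    # the lag survives subsetting
sub[["4.1992"]]     # `[[` still returns the control unit ids
```

Only '[' needs special handling here; '[[' extracts a single element, so there are no object-level attributes to carry over.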
Calling plot on a matched.set object displays a histogram of the sizes of the matched sets. By default, the number of empty matched sets (treated unit/time id pairs with no suitable controls for a match) is noted with a vertical bar at x = 0. To include empty sets in the histogram itself, set the include.empty.sets argument to TRUE.
plot(msets, xlim = c(0, 4))
The summary function reports the sizes of the matched sets, the number of empty sets, and the lag size, along with a data frame of summary overview information. This "overview" data frame is what gets printed by default when calling print on a matched.set object, so to work with that data.frame directly, use the overview item returned by summary. The summary function also has an option to return only the overview data frame; toggle this by setting verbose = FALSE.
print(summary(msets))
$overview
wbcode2 year matched.set.size
1 4 1992 2
2 4 1997 1
3 6 1973 1
4 6 1983 2
5 7 1991 4
6 12 1992 2
7 13 2003 2
8 7 1998 0
$set.size.summary
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.00 1.00 2.00 1.75 2.00 4.00
$number.of.treated.units
[1] 8
$num.units.empty.set
[1] 1
$lag
[1] 4
print(summary(msets, verbose = FALSE))
wbcode2 year matched.set.size
1 4 1992 2
2 4 1997 1
3 6 1973 1
4 6 1983 2
5 7 1991 4
6 12 1992 2
7 13 2003 2
8 7 1998 0
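As a cross-check on these numbers, the size statistics can be recomputed in base R from the control unit ids shown in the verbose printout earlier. This is a sketch that rebuilds the list by hand rather than calling PanelMatch's summary method; the names sets, sizes, set_size_summary, n_treated, and n_empty are introduced here for illustration.

```r
# Control units per matched set, copied from the verbose printout above.
sets <- list(
  "4.1992"  = c("3", "13"),
  "4.1997"  = "7",
  "6.1973"  = "13",
  "6.1983"  = c("4", "13"),
  "7.1991"  = c("3", "4", "12", "13"),
  "12.1992" = c("3", "13"),
  "13.2003" = c("3", "12"),
  "7.1998"  = character(0)
)

sizes <- lengths(sets)                      # number of controls in each set
set_size_summary <- summary(unname(sizes))  # min, quartiles, median, mean, max
n_treated <- length(sets)                   # one entry per treated unit/time pair
n_empty <- sum(sizes == 0)                  # sets with no matched controls

print(set_size_summary)
```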
Passing a matched set (one treated unit and its corresponding set of controls) to the DisplayTreatment function will visually highlight the lag window histories used to create the matched set. To display only the units from the matched set (and the treated unit), set show.set.only to TRUE.
DisplayTreatment(unit.id = "wbcode2", time.id = "year", treatment = 'dem', data = subdem, matched.set = msets[1])
DisplayTreatment(unit.id = "wbcode2", time.id = "year", treatment = 'dem',
data = subdem, matched.set = msets[1], show.set.only = TRUE, y.size = 15, x.size = 13)