To begin, I want to address some efficiency concerns with your code. First, you already instantiated the proper `Treatment` and `Post` indicators to perform the "classical" difference-in-differences (DiD) analysis. It is not necessary to include the additive terms and also manually calculate the interaction inside of `I()`. Simply interact `Treatment` and `Post` and R will estimate the constituent terms of the interaction for free. You could also drop `effect = "individual"`, since the default behavior of `plm()` is to introduce individual effects. Thus, you can achieve the same results with the following:
OLS <- lm(log(Lum) ~ Treatment*Post + Unemp + Illiteracy, data = Long)
FE <- plm(log(Lum) ~ Treatment*Post + Unemp + Illiteracy, index = c("Municipality"), data = Long, model = "within")
My issue now is that if I perform a DiD regression on luminosity, I get identical coefficients regardless of whether I include individual FE or not.
It is entirely plausible to observe identical results when your panel consists of two well-defined treatment/control groups and two well-defined before/after periods. Thus, you can re-estimate the former equation using municipality fixed effects and the interaction term (i.e., `Treatment*Post`) in the same model. This should yield identical DiD coefficients.
It should be noted, however, that the two equations will not produce similar results once you deviate from this setting. For example, suppose treatment adoption was staggered across municipalities; some municipalities get treated in earlier time periods while others get treated in later ones. Or, suppose a subset of treated municipalities withdraw from treatment permanently while others do not. Or, suppose municipalities move into and out of a "treated" status multiple times. Or, suppose you observe multiple treatment groups and each group receives a different "dose" (intensity) of treatment. You can clearly see the many departures from the traditional setting. In more complex settings, there is no guarantee the results from these two models will be similar. But I digress.
To help with your intuition, I encourage you to estimate the two models below. The former model uses `lm()` to estimate your DiD equation with municipality fixed effects; the latter uses `plm()` to estimate the same equation but applies the within transformation. Because the dummy-variable and within approaches are algebraically equivalent, both will produce identical estimates:
OLS <- lm(log(Lum) ~ as.factor(Municipality) + Treatment*Post + Unemp + Illiteracy, data = Long)
FE <- plm(log(Lum) ~ as.factor(Municipality) + Treatment*Post + Unemp + Illiteracy, index = c("Municipality"), data = Long, model = "within")
Note, `as.factor(Municipality)` results in the estimation of dummies for all municipalities. This is algebraically equivalent to estimation in deviations from means. In the former equation using `lm()`, `Treatment` will be dropped, showing up as `NA` in your output summary; it is collinear with the fixed effects. This should not concern you. Its removal does not affect the coefficient on your interaction term.
In the latter equation, the inclusion of these dummies is redundant, as you already specified `model = "within"`. Your model summary using `plm()` will be much cleaner. In fact, the output from `summary(FE)` will omit `Treatment` for you. Either way, your estimates will be similar.
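A quick sketch on simulated data (hypothetical names, not your `Long` panel; assumes the plm package is installed) shows both behaviors: the `NA` on `Treatment` in `lm()` and the matching interaction coefficient from `plm()`:

```r
library(plm)

set.seed(1)
n <- 30
sim <- data.frame(
  Municipality = rep(1:n, each = 2),
  Post         = rep(0:1, times = n),
  Treatment    = rep(as.numeric(1:n > n / 2), each = 2)
)
sim$y <- 1 + 2 * sim$Treatment * sim$Post + rep(rnorm(n), each = 2) + rnorm(2 * n)

# lm() with explicit municipality dummies: Treatment is collinear with them
ols <- lm(y ~ as.factor(Municipality) + Treatment * Post, data = sim)
is.na(coef(ols)["Treatment"])  # TRUE: dropped due to collinearity

# plm() with the within transformation: no dummies needed
fe <- plm(y ~ Treatment * Post, index = "Municipality",
          data = sim, model = "within")

coef(ols)["Treatment:Post"]
coef(fe)["Treatment:Post"]  # matches the lm() interaction coefficient
```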
Addressing concerns in the comments section...
So as I understand it, my FE estimation is a generalization of my OLS estimation?
Correct.
But you write that it's not surprising that coefficients are similar. However, mine are identical, does that make a difference?
No.
Also, the fact that FE & OLS estimates differ if a number of observations are left out randomly confuses me. Is it because then the number of treated months is not identical for all units/treatment does not start at the same time for every municipality?
Yes. You indicated in your post that you randomly removed 10% of your observations (rows). This could remove relevant pre- or post-exposure months. In your full panel, all municipalities might have two post-treatment months. In the abridged panel, a subset of municipalities might have only one—or none. Deliberately discarding observations at random creates an unbalanced panel, and could most certainly impact your point estimates.
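You can verify the loss of balance directly. A sketch with a hypothetical panel (made-up dimensions; assumes the plm package is installed):

```r
# Hypothetical balanced panel: 30 municipalities observed for 4 months
library(plm)

set.seed(1)
sim <- data.frame(
  Municipality = rep(1:30, each = 4),
  Month        = rep(1:4, times = 30)
)

full <- pdata.frame(sim, index = c("Municipality", "Month"))
is.pbalanced(full)  # TRUE

# Randomly discard 10% of the rows, as described in the post
drop_idx <- sample(nrow(sim), size = 0.1 * nrow(sim))
abridged <- pdata.frame(sim[-drop_idx, ], index = c("Municipality", "Month"))
is.pbalanced(abridged)  # FALSE: some municipalities now have fewer months
```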
Other thoughts...
Your model also generalizes to an equation which includes dummies for all units and all time periods. In `lm()`, simply include a full set of dummies for municipalities and a full set of dummies for months.
The following equations should produce similar results as well:
OLS <- lm(log(Lum) ~ as.factor(Municipality) + as.factor(Month) + Treatment*Post + Unemp + Illiteracy, data = Long)
FE <- plm(log(Lum) ~ Treatment*Post + Unemp + Illiteracy, index = c("Municipality", "Month"), data = Long, model = "within", effect = "twoways")
Suppose you have 30 municipalities observed over 36 months. The former model using `lm()` results in the estimation of 29 separate municipality effects and 35 separate month effects. Here, `Treatment` is dropped as it is collinear with the municipality fixed effects; `Post` is dropped as it is collinear with the month fixed effects.
The latter model using `plm()` does this in one shot if you include the unit and time index; this will require you to create a month variable in R (e.g., Jan-2018, Feb-2018, etc.). If you add `effect = "twoways"` to your model, then a summary will only display the coefficient on your interaction term, with the constituent terms removed for free.
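As a sketch on simulated data (hypothetical names; assumes the plm package is installed), the two-way specifications can be compared like so:

```r
library(plm)

set.seed(1)
n <- 30; t <- 4
sim <- data.frame(
  Municipality = rep(1:n, each = t),
  Month        = rep(1:t, times = n)
)
sim$Treatment <- as.numeric(sim$Municipality > n / 2)  # last 15 treated
sim$Post      <- as.numeric(sim$Month > t / 2)         # last 2 months post
sim$y <- 1 + 2 * sim$Treatment * sim$Post +
  rep(rnorm(n), each = t) + rep(rnorm(t), times = n) + 0.1 * rnorm(n * t)

# lm(): explicit dummies for all municipalities and all months
ols <- lm(y ~ as.factor(Municipality) + as.factor(Month) + Treatment * Post,
          data = sim)

# plm(): the two-way within transformation does the sweeping for you
fe <- plm(y ~ Treatment * Post, index = c("Municipality", "Month"),
          data = sim, model = "within", effect = "twoways")

coef(ols)["Treatment:Post"]
coef(fe)["Treatment:Post"]  # should match the lm() DiD estimate
```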
Try out these different specifications and see if they give you the same DiD estimate. They should!