difference-in-differences - multiple time periods analyzing results

Question

I am trying to estimate this regression: $$ y_{it} = \beta_{0} + \beta_{1}\text{Treat}_{i} + \sum_{j \neq k} \lambda_{j} \text{Year}_{t=j} + \sum_{j \neq k} \delta_j \left( \text{Treat}_i \cdot \text{Year}_{t=j} \right) + X_{it}'\gamma + \epsilon_{it}. $$

The time periods are $t = 1, 2, \ldots, k, \ldots, T$, and the treatment happens between $k$ and $k+1$ (so period $k$ is my last pre-treatment period).

I have a few questions:

What is the meaning of the coefficients $\beta_0$ and $\beta_1$?
I know their meaning in a regular pre-post regression, but I am not sure what they mean in this case.
Just to be sure, is there no way to sum up the total influence of the treatment on Y? Can I only tell what the treatment effect is in a specific year? I'm trying to understand how to analyze the results I got from the regression.
What is the meaning of $\gamma$? Does it show how the specific characteristic I added affected Y?

Thank you!

time periods t=1,2,...,k,...,T periods where the treatment happens between k and k+1 (so time k is my last pre-treatment period) — XYZ, May 02 '20 at 10:28
You cannot have time effects $\lambda_t$ for all time periods (when you also include the constant $\beta_0$). This is the standard dummy variable identification problem. The way you construct the sum $\sum_{j \neq k}\lambda_j Year_{t=j}$ basically implies that you are using $k$ as a reference year. That seems odd. But maybe it is what you want. – Jesper for President, May 02 '20 at 10:36
Anyway, if you ignore $X_{it}$ and let $t=k$, then $\mathbb E[y_{ik}] = \beta_0$ for the untreated and $\mathbb E[y_{ik}] = \beta_0 + \beta_1$ for the treated. You can then say something like: "comparing a treated with an untreated unit with the same characteristics $x_{it}$, the parameter $\beta_1$ measures the difference in expected outcome in year $k$, just before treatment." – Jesper for President, May 02 '20 at 10:51
Why can't I include β0? In a regular pre-post regression its meaning is "control before treatment". Does it not have any meaning here? – XYZ, May 02 '20 at 11:09
Let us [continue this discussion in chat](https://chat.stackexchange.com/rooms/107512/discussion-between-jesper-for-president-and-ytz). — Jesper for President, May 02 '20 at 14:35
@Jesper for President This relates to the following [post](https://stats.stackexchange.com/questions/459936/diff-and-diff-using-time-dummies-instead-of-post/461279?noredirect=1#comment857966_461279). The original equation wasn't quite clear so I attempted to clean it up. I welcome any adjustments to make this formulation more digestible for the OP and others. See my answer. — Thomas Bilach, May 02 '20 at 17:05

Thomas Bilach · Accepted Answer · 2020-05-10T01:23:33.227

These are follow-up questions from this post. In the OP's equation, reproduced below,

$$ y_{it} = \beta_{0} + \beta_{1}\text{Treat}_{i} + \sum_{j \neq k} \lambda_{j} \text{Year}_{t=j} + \sum_{j \neq k} \delta_j \left( \text{Treat}_i \times \text{Year}_{t=j} \right) + X_{it}'\gamma + \epsilon_{it}, $$

each $\delta_{j}$ is a separate estimate of the treatment effect in an individual treatment year. As per your post, I assume you consider all $j \leq k$ as your pretreatment epoch. Thus, each $\delta_{j}$ for $j > k$ is an estimate of the $j$-th additive yearly treatment effect. Because of the concerns raised in the comments, you should be more explicit about which periods you are indexing.

Adding multiple pre- and post-treatment periods is not difficult in this setting. Assume all treated units/entities receive treatment at the same time. Then, let $T_{0} + 1$ be the first time period at which treated units receive the treatment; this does not vary across units. I will also define $D_{it}$ as equal to unity for only those periods when treated units enter into the treatment condition. Note, this is simply the product term in the formulation above. Thus, $\text{Treat}_i \times \text{Year}_{t} = D_{it}$. For the control group, we have the following

$$ D^{C} = \begin{Bmatrix} D_{it} = 0 \hspace{3pt} \forall \hspace{3pt} t \end{Bmatrix} $$

which is the baseline history of never receiving the treatment. For the treatment group, the binary treatment indicator is expressed as follows:

$$ D^{T} = \begin{cases} D_{it} = 0 \hspace{3pt} \forall \hspace{3pt} t \leq T_{0} \\ D_{it} = 1 \hspace{3pt} \forall \hspace{3pt} t > T_{0} \end{cases} $$

The periods $t \leq T_{0}$ are the pre-treatment periods (in your notation, $T_{0} = k$). The periods $t > T_{0}$ are the periods during treatment (i.e., the post-treatment period). Thus, $\delta_{j}$ when $t = j$ is a unique estimate of the treatment effect in that year. I assume you are interested, separately, in each one of those effects.
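As a minimal sketch of the indicator just defined, the snippet below builds $D_{it}$ for one control unit and one treated unit. The function name `treatment_indicator` and the six-period example with $T_{0} = 3$ are hypothetical choices made only for illustration.

```python
def treatment_indicator(treated: bool, t: int, T0: int) -> int:
    """Return D_it: 1 only for treated units in periods after T0, else 0."""
    return int(treated and t > T0)


# Hypothetical example: periods 1..6, with T0 = 3 as the last pre-treatment period.
T0 = 3
periods = range(1, 7)

D_control = [treatment_indicator(False, t, T0) for t in periods]
D_treated = [treatment_indicator(True, t, T0) for t in periods]

print(D_control)  # [0, 0, 0, 0, 0, 0]
print(D_treated)  # [0, 0, 0, 1, 1, 1]
```

The control unit's history is all zeros, while the treated unit switches on in the first period after $T_{0}$ and stays on.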

To clear up any confusion regarding notation, you could also specify your equation more explicitly by labeling the limits of summation. Suppose you have yearly data on countries $i$ from 1980 to 2020. Now, say some population-level health intervention affects some countries in 2010, but not others. The intervention is effective beginning in 2010 and remains in place indefinitely. If you want to investigate effects in each individual exposure year, then the classical difference-in-differences equation generalizes to the following:

$$ y_{it} = \beta_{0} + \beta_{1}\text{Treat}_{i} + \sum_{j = 2010}^{2020} \lambda_{j} \text{Year}_{t=j} + \sum_{j = 2010}^{2020} \delta_j \left( \text{Treat}_i \times \text{Year}_{t=j} \right) + X_{it}'\gamma + \epsilon_{it}, $$

where the treatment dummy is interacted—separately—with a post-intervention year (time) dummy. This results in a separate main effect for each post-treatment year, and a separate estimate of the treatment effect for each exposure year. All $t$ periods before 2010 serve as the pre-treatment epoch, and remain coded 0 to reflect that reality.
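To make the coefficients concrete, here is a small sketch under one simplifying assumption: there are no covariates $X_{it}$. In that case the specification has one parameter per group-by-period cell (all pre-treatment years pooled, plus each post-treatment year), so the least-squares coefficients coincide with differences in cell means and can be computed without a regression routine. The function `yearly_did` and the toy data are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def yearly_did(rows, first_treated_year):
    """rows: list of (treat, year, y). Returns (beta0, beta1, lambdas, deltas)."""
    cells = defaultdict(list)
    for treat, year, y in rows:
        # Pool all pre-treatment years into a single reference cell labelled "pre".
        period = year if year >= first_treated_year else "pre"
        cells[(treat, period)].append(y)
    m = {key: mean(values) for key, values in cells.items()}

    beta0 = m[(0, "pre")]                 # control mean before treatment
    beta1 = m[(1, "pre")] - m[(0, "pre")]  # baseline treated-control gap
    post_years = sorted(p for (g, p) in m if g == 0 and p != "pre")
    # lambda_j: change in the control group from the pre-period to year j.
    lambdas = {j: m[(0, j)] - beta0 for j in post_years}
    # delta_j: change in the treated group minus the change in the control group.
    deltas = {j: (m[(1, j)] - m[(1, "pre")]) - lambdas[j] for j in post_years}
    return beta0, beta1, lambdas, deltas


# Hypothetical toy data: (treat, year, y), with treatment starting in 2010.
rows = [
    (0, 2008, 10), (0, 2009, 12), (0, 2010, 13), (0, 2011, 14),
    (1, 2008, 15), (1, 2009, 17), (1, 2010, 20), (1, 2011, 23),
]
beta0, beta1, lambdas, deltas = yearly_did(rows, first_treated_year=2010)
print(beta0, beta1)  # control pre-period mean is 11, baseline gap is 5
print(lambdas)       # year effects: 2 in 2010 and 3 in 2011
print(deltas)        # treatment effects: 2 in 2010 and 4 in 2011
```

Once covariates are added this shortcut no longer applies exactly, and the coefficients have to come from the fitted regression.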

What is the meaning of the coefficients $\beta_{0}$ and $\beta_{1}$? I know their meaning in a regular pre-post regression, but I am not sure what they mean in this case.

This is still an interaction model, even with multiple pre- and post-treatment periods. In the equation above, with dummies for the post-treatment years only, the estimate of $\beta_{0}$ is the mean of your outcome for the control group in the years before treatment is adopted (with the covariates held at zero). The estimate of $\beta_{1}$ is the expected mean difference in your outcome between the treatment and control groups in the pre-treatment period, that is, when all $\text{Year}_{t}$ dummies are equal to 0. This can be viewed as the "baseline difference" in your outcome between the two groups. If you instead include dummies for every year except $k$, as in your original equation, the reference is year $k$ alone: $\beta_{0}$ and $\beta_{1}$ then describe the control-group mean and the treated-control gap in year $k$, as noted in the comments. The estimated coefficient(s) associated with your interaction term(s) should be your focus.
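It may help to write out the conditional means implied by the equation with post-treatment year dummies. This is a sketch that ignores the covariates $X_{it}$ (or holds them fixed); $j$ denotes any post-treatment year from 2010 to 2020:

$$ \begin{aligned} \mathbb{E}[y_{it} \mid \text{Treat}_i = 0,\ t < 2010] &= \beta_{0}, \\ \mathbb{E}[y_{it} \mid \text{Treat}_i = 1,\ t < 2010] &= \beta_{0} + \beta_{1}, \\ \mathbb{E}[y_{it} \mid \text{Treat}_i = 0,\ t = j] &= \beta_{0} + \lambda_{j}, \\ \mathbb{E}[y_{it} \mid \text{Treat}_i = 1,\ t = j] &= \beta_{0} + \beta_{1} + \lambda_{j} + \delta_{j}. \end{aligned} $$

The treated-control gap is $\beta_{1}$ before treatment and $\beta_{1} + \delta_{j}$ in year $j$, so subtracting one from the other leaves $\delta_{j}$. Each interaction coefficient is therefore a difference-in-differences for its own year.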

Just to be sure, is there no way to sum up the total influence of the treatment on Y? Can I only tell what the treatment effect is in a specific year? I'm trying to understand how to analyze the results I got from the regression.

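To summarize the treatment's overall effect in a single number, interact the treatment variable with one post-treatment indicator that equals 1 in all post-treatment periods, irrespective of the unit's group status. This is the classical difference-in-differences setup, and the coefficient on that interaction is the average effect over the whole post-treatment period. In the foregoing equation, by contrast, you are investigating treatment effects in each year individually. It would help if you posted your output, as we are only working with abstract equations.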
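A minimal sketch of the single-indicator version follows, again assuming no covariates so that the estimate reduces to a difference of four group means. The function `pooled_did` and the toy data are hypothetical. In a balanced panel like this one, the pooled estimate equals the simple average of the year-by-year effects measured against the pooled pre-treatment period.

```python
from statistics import mean


def pooled_did(rows, first_treated_year):
    """rows: list of (treat, year, y). Returns the single pooled DiD estimate."""
    def cell_mean(treat, post):
        # Mean outcome for one group in either the pre- or the post-treatment period.
        return mean([y for g, year, y in rows
                     if g == treat and (year >= first_treated_year) == post])

    change_treated = cell_mean(1, True) - cell_mean(1, False)
    change_control = cell_mean(0, True) - cell_mean(0, False)
    return change_treated - change_control


# Hypothetical toy data: (treat, year, y), with treatment starting in 2010.
rows = [
    (0, 2008, 10), (0, 2009, 12), (0, 2010, 13), (0, 2011, 14),
    (1, 2008, 15), (1, 2009, 17), (1, 2010, 20), (1, 2011, 23),
]
effect = pooled_did(rows, first_treated_year=2010)
print(effect)  # 3.0
```

Here the treated group rises by 5.5 and the control group by 2.5, giving a pooled effect of 3.0, which is also the average of the yearly effects of 2 (in 2010) and 4 (in 2011) in this toy panel.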

edited May 10 '20 at 01:23

answered May 02 '20 at 17:02

Thomas Bilach


How can I interpret the meaning of gamma (X coefficient) in this case? the value of gamma in the regression I'm running is 0.021 (Statistical significance of p<0.01) – XYZ May 03 '20 at 13:46
$\gamma$ is the coefficient(s) on your covariate(s). These should be your time-varying control variables. What other variable(s) did you adjust for? – Thomas Bilach May 03 '20 at 14:01
In my regression this variable stands for years of education for each individual i – XYZ May 03 '20 at 14:04
Is it a continuous measure? What are the range of values? Also, what is your outcome? We need the *units of measurement*. Please provide as much information as possible. In the future, you should post your output. – Thomas Bilach May 03 '20 at 14:19
Y is a binary variable: 1 or 0. Years of education is a continuous measure. – XYZ May 03 '20 at 21:17
If Y is a binary variable and I have panel data over 20 different time periods, what is the meaning of each coefficient delta_j? In this case, is it OK to use the 'lm' command in R? – XYZ May 05 '20 at 09:14
Yes. You can run this regression. The use of the `lm()` function with a binary outcome is known as a linear probability model. It will ease the interpretation of the model coefficients. Just make sure you are aware of the potential downsides of using such a model. Remember, you are interacting your treatment dummy with multiple 'post-treatment' year dummies. You can think of *each* $\delta_{j}$ as a *separate estimate* of the *treatment effect* in a given year. Individual estimates may be of substantive interest. – Thomas Bilach May 05 '20 at 17:02
Thank you! So just to be sure: if delta_j in period 12 (2 periods after treatment) equals -0.05, does it mean that the chance of y being equal to 1 decreases by 0.05 during this period? In this case, what is more important, the direction of the change or its size? – XYZ May 05 '20 at 19:44
That is about right. The interpretation is expressed as a probability. As for directionality, what does *theory* tell you about the relationship between $x$ and $y$? I can't answer this for you. As for the individual coefficient, .05 is a rather low probability. This is where you speak to the "practical significance" of the individual estimates. A low $p$-value is nice, but is this important to people (practitioners) interested in your study? This is for you to sell to your audience. – Thomas Bilach May 05 '20 at 22:57
The theory says that the treatment should decrease Y. I would love help with another topic: what is the best way to present the results of this kind of regression? I have data for 20 different periods, so I get a very long table of regression results (20 λj, 20 δj, intercept, β1, γ). Should I show the entire table, or a graph that displays δj over time? I'm asking because in a regular DID regression with only pre-post periods the results are the differences. I don't know what to do in this case and couldn't find a similar paper. – XYZ May 07 '20 at 15:47
To avoid extraneous output, I would only report the coefficients on the interactions of the main treatment indicator with the post-treatment dummies (i.e., $\hat{\delta}_{j}$'s). If you notice effects grow (accumulate) or decline (deteriorate) over time in a meaningful way, then a coefficient plot might tell a better story for readers. I should note, I have seen papers where researchers interact the treatment with *all* pre- and post-treatment dummies. You may have seen this referenced as a "panel event-study design." You could look into that as well! In sum, I would focus on the interactions. – Thomas Bilach May 07 '20 at 16:05
In this discussion - https://stats.stackexchange.com/questions/152735/what-is-the-best-way-to-visualize-difference-in-differences-multi-period-regre. they write that the way to present the results is to plot the averages of the outcome variables for treatment and control group over time. should I also do that? – XYZ May 09 '20 at 09:14
Not only should you do that, you *should have* done that already. Plotting the evolution of the group trends *before* policy adoption is how we assess the validity of the design. There should be a visually clear parallelism between the groups prior to treatment. Typically, you should assess the trends—first. Note, a plot of the group averages between treatment and control groups *over time* is different than a *coefficient plot* of *interactions* between a treatment indicator and a series of time dummies. It is more important to ensure parallel trends. But I would do both! – Thomas Bilach May 10 '20 at 22:48
So if Yi = 1 when a person is employed and 0 otherwise, then for each period I plot the average of Yi in the treatment group and the average of Yi in the control group. If I see parallel trends in the periods before the treatment, it means I can use the DID model (the parallel trend assumption). Is this in addition to the coefficients δj before the treatment being not significant? – XYZ May 11 '20 at 20:19
It wasn’t clear in your question that you were working with a binary outcome. The interpretation will be different. You are working with *probabilities*. The short answer to your question is yes. To start, you should assess the relative *proportions* between the two groups over time. I am not sure if you posted a new question related to the common trend assumption, but if you did you should post the link here so others can follow. – Thomas Bilach May 11 '20 at 21:41
Thank you for your help! https://stats.stackexchange.com/questions/465370/diff-and-diff-with-multiple-time-periods-parallel-trend-assumption – XYZ May 12 '20 at 07:45
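The group-average comparison discussed in the comments above can be sketched as follows for a binary outcome. The functions `group_proportions` and `pre_period_gaps` and the toy data are hypothetical, and the check is only an informal numerical companion to the plot, not a formal test of the parallel trends assumption.

```python
from collections import defaultdict


def group_proportions(rows):
    """rows: list of (treat, period, y) with binary y.

    Returns {(treat, period): share of observations with y == 1}.
    """
    counts = defaultdict(lambda: [0, 0])  # [sum of y, number of observations]
    for treat, period, y in rows:
        counts[(treat, period)][0] += y
        counts[(treat, period)][1] += 1
    return {key: total / n for key, (total, n) in counts.items()}


def pre_period_gaps(props, last_pre_period):
    """Treated-minus-control gap in each pre-treatment period."""
    periods = sorted({p for (_, p) in props if p <= last_pre_period})
    return {p: props[(1, p)] - props[(0, p)] for p in periods}


# Hypothetical toy data: two pre-treatment periods, four people per group.
rows = (
    [(0, 1, y) for y in (1, 0, 0, 0)] + [(0, 2, y) for y in (1, 1, 0, 0)]
    + [(1, 1, y) for y in (1, 1, 0, 0)] + [(1, 2, y) for y in (1, 1, 1, 0)]
)
props = group_proportions(rows)
gaps = pre_period_gaps(props, last_pre_period=2)
print(gaps)  # {1: 0.25, 2: 0.25}
```

A roughly constant gap across the pre-treatment periods, as in this toy example, is what parallel pre-trends look like in the group proportions. Plotting `props` for both groups over time gives the figure described in the comments.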
