Difference in Differences - Do the control units need to be similar?

Question

This is often shown as the setup and simple analysis for a difference-in-differences estimation. Let's say that the control series is much smaller than the treated series. In the figure, this means the green line would be shifted far downward but would still follow a trend similar to the red line (the primary assumption of d-i-d). Does (C-A)-(D-B) still work if the outcome values of the two series are not close to each other? It seems like we would need to use percentage change.

Here is an example. The d-i-d estimate is 963. It doesn't make sense to me, though, to adjust the increase in the treatment market by the absolute change in the control when the sizes are so different. It seems like a better approach would be to take the growth in the control (20% here), apply it to the treated (1.2*1530 = 1836), and then compare this against the actual value of 2500. In this case the estimate would be 664.

It seems like we should compare the treated market's growth to the % change in the control market. As it stands, we subtract the control's absolute change from the treated increase, and that seems wrong because the starting base for the control is so much lower: 7 units is a 20% increase for the control, but it is very small compared to the treated.
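The two candidate estimates in this example can be reproduced with a short sketch. The function names are hypothetical, and the control values 35 and 42 are the ones quoted in the accepted answer; the treated values 1530 and 2500 come from the example itself.

```python
def did_levels(treated_pre, treated_post, control_pre, control_post):
    """Classic 2x2 difference-in-differences in levels: (C - A) - (D - B)."""
    return (treated_post - treated_pre) - (control_post - control_pre)


def growth_adjusted(treated_pre, treated_post, control_pre, control_post):
    """Alternative from the question: grow the treated baseline at the control's rate."""
    control_growth = control_post / control_pre  # 1.2 in this example
    return treated_post - control_growth * treated_pre


# Control goes 35 -> 42, treated goes 1530 -> 2500.
levels_estimate = did_levels(1530, 2500, 35, 42)
growth_estimate = growth_adjusted(1530, 2500, 35, 42)

print(levels_estimate)         # 963
print(round(growth_estimate))  # 664
```

The two numbers differ only in which counterfactual is assumed for the treated market: the same absolute change as the control, or the same percentage change.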

A growth rate implies growth of something (the outcome) over time. The DiD parameter is obtained by subtracting the A outcome from the C outcome (result 1), subtracting the B outcome from the D outcome (result 2), and then subtracting result 2 from result 1. Therefore, the unit of measurement needs to be the same for treatment and control (to compare apples with apples, not apples with oranges). Why do you think it wouldn't work if the outcome values of the series are not close to each other? Note that no calculation involves 'A' and 'D' directly (e.g. 'A-D'). – Andre Silva Dec 22 '16 at 02:02
I added to the question to show an example of what I was troubled by – B_Miner Dec 22 '16 at 02:32
You added an example but did not explain why you think the results are odd. Why can't B3 be 963, and why does 664 make more sense than 963? – Andre Silva Dec 22 '16 at 09:42
It seems like we should compare the treated market's growth to the % change in the control market. As it stands, we subtract the control's absolute change from the treated increase, and that seems wrong because the starting base for the control is so much lower: 7 units is a 20% increase for the control, but it is very small compared to the treated. Does that explain? – B_Miner Dec 22 '16 at 13:41
http://stats.stackexchange.com/questions/564/what-is-difference-in-differences – Andre Silva Dec 22 '16 at 15:08
That shows how a d-i-d is typically defined...but does it speak to my question? – B_Miner Dec 22 '16 at 15:20
The part where the answer says that control and treatment need to have a similar tendency over time before the intervention (i.e. the same growth rates). If they do, the magnitude by which control and treatment are separated in outcome values in the pre-intervention phase is not a problem for calculating the DiD parameter. – Andre Silva Dec 22 '16 at 15:49
Not sure. Parallel pre-intervention trends are the main assumption of the technique, but we can easily imagine a situation where two series follow parallel trends but are shifted far apart from each other (large beta_2) – B_Miner Dec 22 '16 at 15:57
Yes. And..? Why would a large beta_2 imply that the unit of measurement differs between treatment and control? But I probably did not understand your question, so better to wait and see if someone can help you more than I did. Good luck. – Andre Silva Dec 22 '16 at 16:12

Andy · Accepted Answer · 2016-12-23T11:08:07.503

I think the problem is that the "parallel trends" (or parallel paths) assumption here was confused with a "parallel growth" assumption. Two lines are parallel over time when the distance between the two lines remains the same at each point in time. But this is not true when you compute 1.2*1530 (you can easily visualize this).
Instead, the control group increases their outcome by 7 in the second period by going from 35 to 42. The outcome of the treated moves in a parallel fashion only if they also increase their outcome by 7. So the point at the end of the dashed line in your graph for the treated, i.e. the unobserved counterfactual outcome for the treated in the absence of the treatment, should have value 1537. Going from this counterfactual point to the observed point C then yields the solution, 2500 - 1537 = 963.
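As a rough illustration of the distinction, the sketch below builds both counterfactuals for the treated group and checks the gap between the two lines before and after. The variable names are hypothetical; the numbers are the ones from the example.

```python
control_pre, control_post = 35, 42
treated_pre, treated_obs = 1530, 2500

control_change = control_post - control_pre  # 7

# Parallel paths: the counterfactual treated outcome moves by the same absolute amount.
cf_parallel_paths = treated_pre + control_change  # 1537
# Parallel growth: the counterfactual treated outcome moves by the same percentage.
cf_parallel_growth = treated_pre * control_post / control_pre  # 1836.0

# Distance between the treated and control lines in each period.
gap_pre = treated_pre - control_pre                  # 1495
gap_post_paths = cf_parallel_paths - control_post    # 1495, unchanged
gap_post_growth = cf_parallel_growth - control_post  # 1794.0, the lines diverge

print(gap_pre, gap_post_paths, gap_post_growth)
print(treated_obs - cf_parallel_paths)  # 963, the DiD estimate
```

Only the first counterfactual keeps the distance between the lines constant, which is what the standard estimator assumes.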

You raise two very good points though with this question:

1. Even though we care about parallel trends in a diff-in-diff setting and not so much about the initial difference between the treatment and control groups, this assumption is sensitive to functional form. Whilst the outcome series 35-42 and 1530-1537 are parallel in levels they will NOT be parallel in logs. This leaves scope for people to fish for results. Don't like your diff-in-diff graph? Try logs!
2. In the very basic specification above the following is not much of a problem. But if you have multiple periods and you want to control for time trends by including dummies and a parametric linear time trend which is specific to the treatment and control groups, this changes the underlying assumptions of the diff-in-diff estimator. See this World Bank post by Jed Friedman and the corresponding paper he refers to. Admittedly this point is a little off-topic but was fitting with the "parallel trends" vs "parallel growth" issue.
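The functional-form point can be checked numerically. This is a minimal sketch with a hypothetical helper `change`, using the two series 35-42 and 1530-1537 mentioned above and the natural log as the transformation.

```python
import math


def change(pre, post, transform=lambda x: x):
    """Change from pre to post after applying an optional transformation."""
    return transform(post) - transform(pre)


control = (35, 42)
treated_counterfactual = (1530, 1537)

level_changes = (change(*control), change(*treated_counterfactual))
log_changes = (
    change(*control, math.log),
    change(*treated_counterfactual, math.log),
)

print(level_changes)                        # (7, 7): parallel in levels
print([round(c, 4) for c in log_changes])   # [0.1823, 0.0046]: not parallel in logs
```

The same pair of series passes a parallel-paths check in levels and fails it in logs, so the choice of scale is itself an assumption.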

edited Dec 23 '16 at 11:08

answered Dec 22 '16 at 18:59

Andy


Hi Andy. So, when we say the D-I-D assumption is parallel trends, we are talking about the same slope (rise/run) and if two markets (or states or whatever) instead have the same growth rate (percentage change) these would not be candidates for D-i-D? – B_Miner Dec 23 '16 at 01:08
That's right. This is what the linked paper is about though: developing a class of DiD estimators that can exploit parallel growth rather than parallel paths/trends. – Andy Dec 23 '16 at 05:52
Andy - are you familiar with Interrupted Time Series analysis? I took this course : https://www.edx.org/course/policy-analysis-using-interrupted-time-ubcx-itsx-1 and it was stated that there are no parallel trends assumptions since it is being modeled explicitly. Curious if you have used this technique as a superior one to d-i-d? – B_Miner Dec 23 '16 at 15:12
Actually, I never heard of it but perhaps it's just because I'm too stuck in my field :-) could also be that I know it under a different name. Personally I use DiD a lot in my research because it's a tractable method that can be easily understood by policy makers and general audiences. – Andy Dec 23 '16 at 15:59
Andy, if we don't have access to control markets / states / whatever that have parallel trends, but they are growing at similar rates, can a DiD still be used if we log the data (natural log)? – B_Miner Jan 04 '17 at 01:42
According to the linked paper, yes. They establish conditions for when parallel growth is sufficient for their difference in differences estimator to work. I have not yet used their method though and it's a very recent paper so not everyone may like it – Andy Jan 04 '17 at 07:33
I was thinking actually even simpler, if there was just a transformation of the response variable to then allow the normal DiD to work? – B_Miner Jan 04 '17 at 20:05
Meaning if two markets are growing at 20% each period....can't we simply take the log of the response variables and run a normal D-i-D? – B_Miner Jan 04 '17 at 20:18
Ok, now I understand. Yes. In this case the log transformed variable will provide linear paths for the treatment and control groups and the difference-in-differences assumption is satisfied. – Andy Jan 05 '17 at 12:54
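A minimal sketch of that log-transformed DiD, reusing the numbers from the question and assuming (hypothetically) that both markets would have grown by the same 20% without treatment. The function `did` and its `transform` argument are illustrative names, not part of any library.

```python
import math


def did(treated_pre, treated_post, control_pre, control_post, transform=lambda x: x):
    """2x2 difference-in-differences on a transformed outcome."""
    t = transform
    return (t(treated_post) - t(treated_pre)) - (t(control_post) - t(control_pre))


# DiD on the natural log of the outcome.
log_did = did(1530, 2500, 35, 42, transform=math.log)

# Exponentiating gives the ratio of observed to counterfactual treated outcome.
ratio = math.exp(log_did)
counterfactual = 2500 / ratio          # about 1836, i.e. 1.2 * 1530
effect_levels = 2500 - counterfactual  # about 664

print(round(log_did, 3))     # 0.309
print(round(effect_levels))  # 664
```

Under this assumption the log DiD recovers the same counterfactual as the 1.2*1530 calculation in the question, expressed as a proportional effect rather than a difference in levels.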
