
I'm doing a sharp regression discontinuity design with treatment variable $$ D_i = \begin{cases} 1 & \text{if } x_i \geq \overline{x} \\ 0 & \text{otherwise} \end{cases} $$ where $\overline{x}$ is the threshold value of my forcing variable. The treatment is assigned from a certain income level onwards, and the outcome $Y$ is a health-related status of the patient.
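For concreteness, this assignment rule is just one line of R (income, cutoff, and the data frame df are placeholder names, not my actual variables):

```r
# Sharp RD: treatment switches on deterministically at the income threshold
df$D <- as.integer(df$income >= cutoff)
```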

I have done my analysis and gotten pretty good results, but now I would like to test the robustness of those results. Perhaps other unobserved factors affect the outcome (I have many different variables on the patients), or patients could manipulate their income around the threshold because the threshold is known in advance.

How else can I show the robustness of my results? So far I have used polynomial RD and RD with local linear regression, and the results are stable. Is there more? Should I use some data visualisation methods?

user45843

2 Answers


You should definitely support your analysis with graphs. This is an integral part of any regression discontinuity design study because consistent estimation of the treatment effect requires that

  • you have specified the correct functional form (polynomials) on both sides of the threshold (or chosen an appropriate kernel and bandwidth for the non-parametric method)
  • no variable other than the outcome jumps at the threshold

To test whether you have the right polynomial order, you can divide your data into bins and include a dummy for each bin in your regression. If your functional form is right, none of those dummies should be significant (see Lee and Lemieux, 2010). If some bin dummies are significant, you can add higher-order polynomials until this significance vanishes; a sketch of this test follows below. In the same paper they also suggest methods for choosing the appropriate bandwidth for the non-parametric RD. You should also vary the window around the cutoff: if your results are robust, the standard errors should increase as the window shrinks (fewer observations) but the point estimates should remain the same or at least similar.
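For illustration, a minimal sketch of this bin-dummy test in R could look as follows (the data frame df and the variables health and income are hypothetical names, and the polynomial order and number of bins are arbitrary choices you would vary):

```r
# Bin-dummy specification test (Lee and Lemieux, 2010)
cutoff <- 50000                              # hypothetical income threshold
df$D   <- as.integer(df$income >= cutoff)    # sharp treatment indicator
df$xc  <- df$income - cutoff                 # forcing variable centered at the cutoff
df$bin <- cut(df$xc, breaks = 20)            # 20 equal-width bins of the forcing variable

restricted   <- lm(health ~ D * poly(xc, 2, raw = TRUE), data = df)
unrestricted <- update(restricted, . ~ . + bin)   # lm() silently drops any bin dummy
                                                  # that is collinear with other regressors

# Joint F-test of the bin dummies: a significant statistic suggests the
# quadratic is too restrictive, so try a higher-order polynomial.
anova(restricted, unrestricted)
```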

In terms of the graphs, it is always nice to construct bins and average the variables within those bins (play around with the bin sizes to convince yourself of the robustness of your results) and plot

  1. the outcome over the forcing variable (there should be a discontinuity at the cutoff)
  2. your controls over the forcing variable (they should not jump around the cutoff)
  3. the density of the forcing variable over the forcing variable (this gives you an idea of whether individuals manipulated their value of the forcing variable - if so, you will see a spike in the density just after the threshold)

A nice graph for points 1. and 2. looks something like this: [figure: binned means of the outcome and of the covariates plotted against the forcing variable]
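A minimal sketch of such a binned plot, assuming the same hypothetical data frame df with centered forcing variable xc and outcome health as above (bin width and polynomial order are arbitrary and worth varying):

```r
library(dplyr)
library(ggplot2)

# Average the outcome within bins of the forcing variable
binned <- df %>%
  mutate(bin = cut(xc, breaks = 40)) %>%
  group_by(bin) %>%
  summarise(xc = mean(xc), health = mean(health))

# Binned means with separate quadratic fits on each side of the cutoff
ggplot(binned, aes(x = xc, y = health)) +
  geom_point() +
  geom_smooth(data = filter(df, xc <  0), method = "lm",
              formula = y ~ poly(x, 2), se = FALSE) +
  geom_smooth(data = filter(df, xc >= 0), method = "lm",
              formula = y ~ poly(x, 2), se = FALSE) +
  geom_vline(xintercept = 0, linetype = "dashed") +
  labs(x = "Income (centered at the cutoff)", y = "Mean outcome within bin")
```

Replacing health with one of the controls gives the plot for point 2.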

Related to the third point, you can also formally test for a jump in the density of the forcing variable around the cutoff. McCrary (2008) came up with a testing method for this purpose. This will tell you again whether some manipulation is going on around the threshold.
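A minimal sketch of this test using the DCdensity function from the rdd package (mentioned in the other answer); income and cutoff are the same hypothetical names as above, and you should double-check the argument names against the package documentation:

```r
library(rdd)

# McCrary (2008) density test: a significant jump in the density of the
# forcing variable at the cutoff is evidence of manipulation.
DCdensity(df$income, cutpoint = cutoff)   # returns the p-value and plots the density
```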

With respect to point 2., you can use your controls as outcome variables, re-run the analysis, and see if they jump at the threshold. They shouldn't; otherwise your analysis is in trouble. Other placebo tests of this sort are suggested in Imbens and Lemieux (2008).
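As an illustration, such a placebo regression could look like this, with the hypothetical covariate age standing in for one of your patient characteristics and the same df, D, and xc as above:

```r
# Covariate "placebo outcome": the estimated jump at the cutoff should be
# small and statistically insignificant.
placebo <- lm(age ~ D * poly(xc, 2, raw = TRUE), data = df)
summary(placebo)   # inspect the estimate and standard error on D
```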

Andy

Note that most of these tests are readily available in the R package RDDtools, which offers regression sensitivity analysis (plot of bandwidth sensitivity, placebo plot) as well as design sensitivity analysis (McCrary test of manipulation, test of equality of covariates around the threshold).

A few examples:

Bin plot of the raw data: use plot() on your RDDdata object. [figure: bin plot]

Plot of sensitivity to the bandwidth: simply use plotSensi(your regression). [figure: bandwidth sensitivity plot]

Placebo plot: simply use plotPlacebo(your regression). [figure: placebo plot]

McCrary density plot (from package rdd). [figure: McCrary density plot]
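Putting these together, a minimal RDDtools workflow might look like the sketch below. The data frame and variable names are hypothetical, and the exact function and argument names may differ between package versions, so check the package documentation:

```r
library(RDDtools)

# Build an RDDdata object from the hypothetical health/income data
rd_data <- RDDdata(y = df$health, x = df$income, cutpoint = cutoff)
plot(rd_data)             # bin plot of the raw data

# Parametric RD fit (4th-order polynomial is just an example)
rd_fit <- RDDreg_lm(rd_data, order = 4)

plotSensi(rd_fit)         # sensitivity of the estimate to the bandwidth / polynomial order
plotPlacebo(rd_fit)       # estimated "effects" at fake cutoffs
dens_test(rd_data)        # McCrary density test at the true cutoff
```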

Matifou