
I'm aware that first differences and fixed effects are both designed to solve the same problem: removing unobserved unit-level effects.

However, I'm unclear on what happens when you include a unit-level dummy in a first differences model (I've seen this done for error-correction models as well as elsewhere). Does including the dummy essentially demean the differences (a sort of soft unit-level time trend)? Or does it simply introduce bias?

amoeba
reson

1 Answer


If you have $N$ individuals and you include $N-1$ individual dummies (one fewer, to avoid the dummy variable trap) in an OLS regression like $$y_{it} = X'_{it}\beta + \sum_{j=1}^{N-1}\delta_j (\text{individual}_j) + \epsilon_{it}$$ then this is called a least squares dummy variable (LSDV) regression. Here $\text{individual}_j$ equals one when the observation belongs to individual $j$ and zero otherwise. Each individual dummy will "absorb" the individual fixed effect $u_i$ that is hidden in the error term $\epsilon_{it} = u_i + e_{it}$.

Mundlak (1978) has shown that the LSDV regression is equivalent to the fixed effects (within) estimator: $$y_{it} - \overline{y}_{i} = (X_{it} - \overline{X}_i)'\beta + \epsilon_{it} - \overline{\epsilon}_i$$ where $\overline{y}_{i} = \frac{1}{T}\sum^{T}_{t=1}y_{it}$, $\overline{X}_{i} = \frac{1}{T}\sum^{T}_{t=1}X_{it}$, and $\overline{\epsilon}_{i} = \frac{1}{T}\sum^{T}_{t=1}\epsilon_{it}$. Because $u_i$ is constant within each individual, it cancels from $\epsilon_{it} - \overline{\epsilon}_i$, leaving only $e_{it} - \overline{e}_i$. Back in the days when computers weren't very fast, large panels basically made LSDV infeasible because there were too many dummies. Mundlak's finding was therefore very useful: it dispenses with all these individual dummies, and using the within transformation instead made things much simpler.
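The equivalence is easy to check numerically. The following is a minimal sketch on simulated data, not anything from the original answer: the panel size, the true slope of 2.0, and all variable names are assumptions made for illustration. It uses a full set of $N$ dummies without an intercept, which spans the same columns as an intercept plus $N-1$ dummies.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 5, 6                                  # hypothetical panel dimensions
unit = np.repeat(np.arange(N), T)            # unit index of each observation
u = rng.normal(size=N)                       # unobserved fixed effects u_i
x = rng.normal(size=N * T) + u[unit]         # regressor correlated with u_i
y = 2.0 * x + u[unit] + rng.normal(scale=0.1, size=N * T)

# LSDV: OLS of y on x plus one dummy per unit (no separate intercept).
D = (unit[:, None] == np.arange(N)).astype(float)
beta_lsdv = np.linalg.lstsq(np.column_stack([x, D]), y, rcond=None)[0][0]

# Within estimator: subtract unit means from y and x, then OLS without dummies.
def demean(v):
    means = np.bincount(unit, weights=v) / T
    return v - means[unit]

xd, yd = demean(x), demean(y)
beta_within = (xd @ yd) / (xd @ xd)

print(beta_lsdv, beta_within)
```

The two slope estimates agree up to floating-point error, while the within version never has to build the $N$ dummy columns.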

So if you do a fixed effects regression you don't need to include the individual dummies. In fact, your statistical software will just drop them should you include them in a fixed effects regression. Likewise, in a first differences regression where the dummies are differenced along with the other variables, they drop out: they do not change over time, so their difference is zero for every observation and your software will omit them due to perfect collinearity. Doing either fixed effects or first differences already solves the problem of time-invariant unobserved variables ($u_i$). LSDV is just another way of doing it, so combining it with the other methods does nothing further to remove $u_i$.
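To spell out the first differences case, subtract the equation for period $t-1$ from the equation for period $t$, using $\epsilon_{it} = u_i + e_{it}$ from above (the $\Delta$ notation is introduced here for illustration): $$\Delta y_{it} = y_{it} - y_{i,t-1} = (X_{it} - X_{i,t-1})'\beta + (u_i - u_i) + (e_{it} - e_{i,t-1}) = \Delta X_{it}'\beta + \Delta e_{it}$$ The fixed effect $u_i$ cancels, and so does any other time-invariant regressor, including a differenced individual dummy.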

If you instead include undifferenced individual dummies alongside first-differenced variables, i.e. a first differences regression with individual dummies, those dummies will estimate individual-specific trend effects (see page 77, footnote 1 in the notes here).
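To see why, suppose the levels model contains a unit-specific linear trend, $y_{it} = X'_{it}\beta + u_i + g_i t + e_{it}$. Differencing gives $\Delta y_{it} = \Delta X_{it}'\beta + g_i + \Delta e_{it}$, so $u_i$ disappears but the trend slope $g_i$ survives as a unit-specific constant, which is exactly what a unit dummy picks up. Below is a minimal simulated sketch under that assumed model; the slope $g_i$, the panel size, and all names are hypothetical and not taken from the linked notes.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T = 50, 20                                  # hypothetical panel dimensions
beta = 2.0                                     # assumed true slope
t = np.arange(T)
u = rng.normal(size=N)                         # fixed effects u_i
g = rng.normal(size=N)                         # unit-specific trend slopes g_i
x = g[:, None] * t + rng.normal(size=(N, T))   # regressor trends with g_i too
e = rng.normal(scale=0.1, size=(N, T))
y = beta * x + u[:, None] + g[:, None] * t + e

# First-difference within each unit; ravel() keeps observations grouped by unit.
dy = np.diff(y, axis=1).ravel()
dx = np.diff(x, axis=1).ravel()
unit = np.repeat(np.arange(N), T - 1)
D = (unit[:, None] == np.arange(N)).astype(float)

# First differences plus unit dummies: dummy coefficients estimate g_i.
coef = np.linalg.lstsq(np.column_stack([dx, D]), dy, rcond=None)[0]
beta_fd_dummies, g_hat = coef[0], coef[1:]

# Plain first differences (common intercept only), for comparison.
plain = np.linalg.lstsq(np.column_stack([dx, np.ones_like(dx)]), dy, rcond=None)[0]
beta_fd_plain = plain[0]

print(beta_fd_dummies, beta_fd_plain)
print(np.max(np.abs(g_hat - g)))
```

In this setup the dummy coefficients track the simulated $g_i$ closely and the slope estimate stays near the assumed $\beta$. The plain first differences estimate is pulled away from it because $\Delta x_{it}$ is correlated with the omitted $g_i$ by construction.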

Andy
  • Angrist and Pischke in Mostly Harmless Econometrics, chap. 5.1., state: "With two periods, [first-]differencing is algebraically the same as deviations from means, but not otherwise. Both should work, although with homoskedastic and serially uncorrelated [errors] deviations from means is more efficient." I'm not competent enough to see where the efficiency gain would come from. Maybe because you lose N observations when you do first differences? – Julian Schuessler Jul 02 '14 at 18:53
  • Thanks Andy. I think I didn't necessarily phrase my original question very well. Was more referring to the difference between $\Delta y_{it} = \beta_1 \Delta X_{it} + \beta_2 D_i + \Delta\epsilon_{it}$, where $D_i$ is a unit (or individual) dummy, and $\Delta y_{it} = \beta_1 \Delta X_{it} + \Delta\epsilon_{it}$. Note that $D_i$ is not necessarily dropped in the first case. – reson Jul 02 '14 at 18:59
  • @JulianSchuessler it's true that first differences and LSDV/FE are the same only for 2 periods. What I said was that LSDV and FE are equivalent. reson, I will update the answer but if first differencing already eliminates unobserved fixed effects, why would you still want to include the individual dummies? If you give me an example it will be easier to produce a good answer for your problem. – Andy Jul 02 '14 at 19:17
  • Perhaps to adjust for distinct unit-level time trends caused by unobserved factors. – reson Jul 02 '14 at 19:29
  • Okay, now I get it. So basically the answer is yes. If you do OLS on first differenced variables and individual dummies the dummies will pick up individual trend effects. See for instance [here](http://www.fordham.edu/economics/mcleod/PCGivePanelModeling.pdf) in footnote 1 on page 77. I will adapt my post and hope that this answers the question :) – Andy Jul 02 '14 at 19:32
  • @Andy Commenting on an old post, hope you will still read it…I would be interested to have a look at the link you posted, but it doesn't seem to work anymore. Are the lecture notes still available somewhere? If not, do you know of another source? Thanks! – Matthijs Jan 30 '16 at 21:12
  • @Matthijs thanks for your comment. I found the lecture notes elsewhere and updated the link. I hope this is useful for you. – Andy Jan 31 '16 at 10:48
  • @Andy Thanks! So if I understand correctly, adding other-than-time dummies in a first-differenced model is equivalent to adding those same dummies interacted with time in the model in levels (i.e., the non-differenced model)? – Matthijs Jan 31 '16 at 12:44