
Consider a linear unobserved effects model of the type: $$y_{it} = X_{it}\beta + c_{i} + e_{it}$$ where $c_i$ is an unobserved but time-invariant characteristic and $e_{it}$ is an error term; $i$ and $t$ index individuals and time, respectively. The typical approach in a fixed effects (FE) regression would be to remove $c_i$ via individual dummies (LSDV) / de-meaning or by first differencing.
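To make the de-meaning concrete: subtracting individual means gives $y_{it} - \bar{y}_i = (X_{it} - \bar{X}_i)\beta + (e_{it} - \bar{e}_i)$, so $c_i$ drops out. Here is a minimal simulation sketch (the setup and numbers are my own, purely for illustration): pooled OLS is biased when $c_i$ correlates with $X_{it}$, while the within transformation recovers $\beta$.

```python
# Minimal sketch (illustrative setup, not from the question): panel data where
# the unobserved effect c_i is correlated with X_it.
import numpy as np

rng = np.random.default_rng(0)
n, T, beta = 500, 10, 1.5

c = rng.normal(size=n)                       # unobserved time-invariant effect
X = c[:, None] + rng.normal(size=(n, T))     # X correlated with c -> pooled OLS biased
e = rng.normal(size=(n, T))
y = beta * X + c[:, None] + e

# Pooled OLS ignores c_i and is inconsistent here.
b_pooled = (X.ravel() @ y.ravel()) / (X.ravel() @ X.ravel())

# Within transformation: subtract individual means, which removes c_i.
Xd = X - X.mean(axis=1, keepdims=True)
yd = y - y.mean(axis=1, keepdims=True)
b_fe = (Xd.ravel() @ yd.ravel()) / (Xd.ravel() @ Xd.ravel())

print(f"pooled OLS: {b_pooled:.3f}, within estimator: {b_fe:.3f}")  # ~2.0 vs ~1.5
```

The pooled estimate absorbs the correlation between $X_{it}$ and $c_i$; the within estimate does not, which is exactly why the FE transformations above are used.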

What I have always wondered: when is $c_{i}$ truly "fixed"?

This might appear a trivial question, but let me give two examples that explain my reason for asking it.

  1. Suppose we interview a person today and ask about her income, weight, etc., so we get our $X$. For the next 10 days we return to that same person and interview her again each day, so we have panel data for her. Should we treat unobserved characteristics as fixed for this period of 10 days when they will surely change at some other point in the future? In 10 days her personal ability might not change, but it will as she gets older. Or, to put it more extremely: if I interview this person every hour for 10 hours in a day, her unobserved characteristics are likely to be fixed in this "sample", but how useful is this?

  2. Now suppose we instead interview a person every month from the start to the end of her life, for 85 years or so. What will remain fixed over this time? Place of birth, gender, and eye color most likely, but apart from those I can hardly think of anything else. Even more importantly: what if there is a characteristic that changes at one single point in her life, but the change is infinitesimally small? Then, strictly speaking, it is not a fixed effect anymore, because it changed, even though in practice the characteristic is quasi-fixed.

From a statistical point of view it is relatively clear what a fixed effect is, but from an intuitive point of view it is something I find hard to make sense of. Maybe someone else has had these thoughts before and come up with an argument about when a fixed effect is really a fixed effect. I would very much appreciate other thoughts on this topic.

Andy
  • +1, good question & good answers. Perhaps it's worth remembering that "all models are wrong, but some are useful" – [George Box](http://stats.stackexchange.com/questions/726/famous-statistician-quotes/730#730). – gung - Reinstate Monica Jun 29 '13 at 22:53
  • I'm probably confused about this, but isn't the continuum: 1) if $c_i$ is treated as the same for all $i$, you have a pooled model, 2) if $c_i$ is treated as the same for all $z_{j[i]}$ (dummy variables for groups, which could include "year" or "day"), you have a FE model, and 3) if $c_{j[i]}$ is treated as a distribution, you have a RE model. See: http://userwww.service.emory.edu/~tclark7/randomeffects.pdf . – Wayne Jun 30 '13 at 01:11

3 Answers


If you are interested in this formulation for causal inference about $\beta$ then the unknown quantities represented by $c_i$ need only be stable for the duration of the study / data for fixed effects to identify the relevant causal quantity.

If you are concerned that the quantities represented by $c_i$ aren't stable even over this period, then fixed effects won't do what you want. You can use random effects instead, although if you expect correlation between the random $c_i$ and $X_i$ you would want to condition $c_i$ on $\bar{X}_i$ in a multilevel setup (a Mundlak-style correlated random effects formulation). Concern about this correlation is often one of the motivations for a fixed effects formulation, because under many (but not all) circumstances you don't need to worry about it then.
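As a hedged sketch of the "condition $c_i$ on $\bar{X}_i$" idea (the simulated data and names are mine, not from this answer): adding the individual means of $X$ as a regressor lets a pooled regression reproduce the within estimate of $\beta$ even though $c_i$ and $X_{it}$ are correlated.

```python
# Sketch of a Mundlak-style correlated random effects regression
# (illustrative simulation, my own setup).
import numpy as np

rng = np.random.default_rng(1)
n, T, beta = 500, 10, 1.5
c = rng.normal(size=n)
X = c[:, None] + rng.normal(size=(n, T))     # c_i correlated with X_it
y = beta * X + c[:, None] + rng.normal(size=(n, T))

Xbar = np.repeat(X.mean(axis=1), T)          # individual means of X, one per obs
Z = np.column_stack([np.ones(n * T), X.ravel(), Xbar])
coef, *_ = np.linalg.lstsq(Z, y.ravel(), rcond=None)
print(f"beta with Xbar_i included: {coef[1]:.3f}")   # matches the within estimate
```

In a balanced panel the coefficient on $X_{it}$ in this augmented regression equals the within (FE) estimate exactly, which is why conditioning on $\bar{X}_i$ addresses the correlation concern.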

In short, your concern about variation in the quantities represented by $c_i$ is very reasonable, but mostly as it affects the data for the period you have rather than periods you might have had or that you may eventually have but don't.

conjugateprior
  • +1 I like this answer. But what about an incredibly small change in something that is supposed to be fixed over the sample period? If my person in the 10 days sample hits her head on day 6 and is less intelligent afterwards by an infinitesimally small amount represented by the brain cells that died (just as a trivial example): can her ability still be treated as a fixed effect if it is almost fixed? – Andy Jun 28 '13 at 13:16
  • Sure. Maybe think about it like this: it's the *parameter* that's fixed, and it may represent something in the world that's 'really' constant, or it may not, e.g. if it represents the average of something that actually varies. The question is: what inferential difference does it make to put a fixed effect in rather than something else? In the causal inference case the question is: does assuming fixed effects *decrease* confounding more than the small variations left uncaptured by the parameter *increase* confounding? – conjugateprior Jun 28 '13 at 14:15
  • @Andy: Once you start talking about a bump to the head changing someone's IQ because a few brain cells were traumatized, where does it stop? Nothing you measure in the real world is so fixed that it doesn't change (infinitesimally) on a moment-to-moment basis, if you can measure it accurately enough. You simply have to use reasonable judgement, and be explicit about that judgement when stating your results. As conjugateprior says, fixed effects are a concept distinct from "unchangeable" and refer both to a specific thing (parameters) and to your specific goal (population, group, etc.). – Wayne Jun 29 '13 at 16:59
  • You are right that the example with the brain cells is somewhat far-fetched. I just wanted to think more about the nature of fixed effects, because most textbooks and lectures are rather silent on this intuitive aspect. Sure, they give examples, but none that would answer my questions. For this purpose I found it very useful to bring up the question here, and the answers and comments so far have been very helpful. – Andy Jun 29 '13 at 18:02

The distinction between a fixed effect and a random effect typically has no implications for the estimates (edit: at least in the simple textbook uncorrelated cases) beyond a matter of efficiency, but it has considerable implications for testing.

For the purpose of testing, the question you should be asking yourself is: what is the level of noise your signal should surpass? That is, to what population do you want to generalize your findings? Using example (1): should it be the variability within the same day, over a longer period, or across different individuals?

The more variance components you infer over, the stronger your scientific finding, with better chances of replication. There is naturally a limit to the amount of generalization you can ask for: not only does the noise get stronger, but the signal ($E(c_i)$) also gets weaker. To see this, imagine $E(c_i)$ is the expected effect of $X_i$ on weight, not over some life periods of a single subject, but over all mammals.
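A small numerical sketch of this point (toy numbers of my own choosing): the standard error of an overall mean depends on which variance components you treat as noise, i.e., on the population you generalize to.

```python
# Sketch: the same data give different uncertainty depending on whether you
# generalize over individual-by-day observations or over individuals.
import numpy as np

rng = np.random.default_rng(2)
n, T = 200, 10
c = rng.normal(scale=1.0, size=n)                 # between-individual variation
y = 5.0 + c[:, None] + rng.normal(scale=0.5, size=(n, T))

se_naive = y.std(ddof=1) / np.sqrt(n * T)              # treats all n*T draws as independent
se_indiv = y.mean(axis=1).std(ddof=1) / np.sqrt(n)     # generalizes over individuals
print(f"naive SE: {se_naive:.4f}, individual-level SE: {se_indiv:.4f}")
```

Treating the $n \times T$ observations as independent understates the uncertainty relative to generalizing over individuals; generalizing further still (say, over all mammals) would add yet more variance components.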

JohnRos
  • I can follow the rest of your answer but I am doubtful about the first part. Fixed effects allows for arbitrary correlations between the $X$ and the fixed effects, whilst in random effects the two must be uncorrelated. If this is not true, then RE is inconsistent. So this does have implications for the estimates. – Andy Jun 29 '13 at 17:56
  • and even if the random $c$ are uncorrelated with $X$ they'll still be shrunk towards each other relative to fixed $c$ estimates. – conjugateprior Jun 29 '13 at 18:19
  • @conjugateprior: $c_i$ will indeed be shrunk, but the group inference is on $E(c_i)$ which is not shrunk. – JohnRos Jun 29 '13 at 21:54
  • @Andy: I don't see a reason not to allow for correlations between the effects and the noise in RE, but if we agree on the rest of the answer, I rather simply edit my answer. – JohnRos Jun 29 '13 at 21:57

I've struggled with similar questions; see my blog post, *A Festschrift for Lord, his paradox and Novick's prediction*, and here is my best attempt (hopefully with corrections if I am woefully wrong). If we drop the non-random shocks, $X_{it} \beta$, from the equation, we simply have:

$$y_{it} = c_i + e_{it}$$

This can be viewed as a random walk by going one step back in time and differencing:

\begin{align}
y_{it} &= c_i + e_{it} \\
y_{i,t-1} &= c_i + e_{i,t-1} \\
y_{it} - y_{i,t-1} &= e_{it} - e_{i,t-1}
\end{align}

So this is just a reframing of conjugateprior's answer that $c_i$ "need only be stable for the duration of the study" - but a reframing I find useful. So, during the duration of the study, is it reasonable to consider that, absent the treatments of interest (the $X_{it} \beta$ part), the outcome would be a random walk guided only by random exogenous shocks, the $e_{it}$'s? Of course this is not strictly true except in trivially pedantic circumstances.
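A quick simulation sketch of this framing (the data-generating process is mine, for illustration only): differencing removes $c_i$ entirely, so with the $X_{it} \beta$ part put back in, a regression on first differences recovers $\beta$.

```python
# Sketch: first differencing cancels c_i, leaving only the shocks e_it,
# so the differenced regression identifies beta.
import numpy as np

rng = np.random.default_rng(3)
n, T, beta = 500, 10, 1.5
c = rng.normal(size=n)
X = c[:, None] + rng.normal(size=(n, T))
y = beta * X + c[:, None] + rng.normal(size=(n, T))

dy = np.diff(y, axis=1).ravel()              # y_it - y_i,t-1 : c_i cancels
dX = np.diff(X, axis=1).ravel()
b_fd = (dX @ dy) / (dX @ dX)
print(f"first-difference estimate: {b_fd:.3f}")   # ~= beta = 1.5
```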

That is where my advice ends though. As gung mentions the George Box phrase, "all models are wrong, but some are useful". You would know better than I how to determine when this simplification is justified in a particular research design. It can be assumed we can't observe $c_i$ just the same as the random walk is not an accurate representation of reality - even for a tiny slice of time.

I might guess, for your particular example of the survey, that questions measuring flow-type data (e.g. income, weight) may reasonably be treated as random walks over particularly short time frames. For stock-type data, though (such as how many coffees you drank today), the assumption seems rather more perverse.

Andy W
  • +1 Thanks for the link and your answer! I'm happy that this question still attracts interest and that more can be added to it. This was insightful. – Andy May 07 '14 at 13:19