
Now I'm having a hard time grasping the difference between fixed and random effects in regression models. I believe I understand that random effects are recommended when you allow for heterogeneity of slopes, when the data are nested in hierarchical levels, and so on.

But here's the question.

  1. Why don't we just include a moderating variable (an interaction term) if we want to capture the effect changing across groups? For example, if the effect of study time on GPA differs across classrooms, then why not just make a dummy variable for the classroom and include an interaction term? I cannot understand what the point is here.

  2. What is the overall intuition behind the central assumption of the random effects model? What is the main idea that runs through the logic of the random effects model? I don't want a mathematical or statistical explanation; I want to draw a hypothetical picture in my head.

amoeba
Kang Inkyu
  • About the dummy variable: that works if the variable has a limited number of values (like the classrooms in your case), but not when there is a huge number of values, and that is the trick; if you have a huge number of values then you get a huge number of intercepts (or slopes), thus a lot of dummies, and then you cannot estimate the model well (you lose many degrees of freedom because you have a lot of explanatory variables). In that case you can use random effects; i.e. you assume that the intercepts are normally distributed, and then your huge number of dummies ... –  Oct 02 '16 at 08:59
  • ... is ''summarised'' in a normal distribution. The latter has only two parameters (mean and standard deviation), so instead of estimating a huge number of coefficients (namely one for each of your dummies) you only have to estimate two parameters (mean and standard deviation), and you know the distribution of the intercepts. This saves a lot of degrees of freedom. –  Oct 02 '16 at 09:01
  • Wow, thanks, that really helped. So many people gave explanations in inconsistent ways, but I think this sums everything up. – Kang Inkyu Oct 02 '16 at 09:37
  • @fcop One more question: does your explanation have something to do with the clustering effect (or the nested effect, if you will)? I cannot find an intuitive link between them. Is your comment applicable to the logic of clustering? Sorry for my bad English, by the way. – Kang Inkyu Oct 04 '16 at 12:23

3 Answers


One way to think about fixed effects vs. random effects is to examine how the fixed-effects estimator works in comparison to the random-effects estimator.

Let's say I have panel data on firms. Let $y_{i,t}$ be dividends for firm $i$ at time $t$. Let $x_{i,t}$ be an explanatory variable we're interested in, such as free cash flow.

Imagine our model is:

$$ y_{i,t} = \beta x_{i,t} + u_i + \epsilon_{i,t} $$

So dividends for firm $i$ at time $t$ are the sum of $\beta$ times free cash flow plus a firm-specific effect $u_i$ and a firm-and-time-specific error term $\epsilon_{i,t}$. Now let's imagine two different estimators:

  • The within estimator. $\beta$ is estimated using only time-series variation within each firm.
  • The between estimator. $\beta$ is estimated using only the variation between different firms. (The between estimator is $\beta$ from the cross-sectional regression $\bar{y}_i = \beta \bar{x}_i + v_i$.)

The within estimator is the fixed-effects estimator. It subtracts each firm's mean, so the only variation left over to estimate $\beta$ is the time-series variation within each firm. If the fixed effects can be anything (in particular, arbitrarily correlated with $x_{i,t}$), this is what you have to do.
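To make the demeaning step concrete (this is implicit in the answer rather than written out), subtracting each firm's time average from both sides of the model removes $u_i$:

$$ y_{i,t} - \bar{y}_i = \beta \, (x_{i,t} - \bar{x}_i) + (\epsilon_{i,t} - \bar{\epsilon}_i), \qquad \bar{y}_i = \frac{1}{T}\sum_{t=1}^{T} y_{i,t}, $$

so OLS on the demeaned data identifies $\beta$ from within-firm variation alone, whatever the $u_i$ are.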

The random effects estimator is a weighted average of the within estimator and the between estimator. If the effects $u_i$ are random, mean zero, and uncorrelated with the regressors, then variation between firms also contains information about $\beta$, and the between estimator is also a consistent estimator. Rather than tossing out the between-firm variation (as the fixed-effects estimator does), the random-effects estimator of $\beta$ gives that variation some weight.
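Here is a minimal numerical sketch of this point (mine, not part of the original answer): it simulates the model above with firm effects that are independent of the regressor, then computes the within, between, and random-effects estimates by hand. The random-effects step uses the standard balanced-panel quasi-demeaning factor $\theta = 1 - \sqrt{\sigma_\epsilon^2 / (\sigma_\epsilon^2 + T \sigma_u^2)}$, which ranges from pooled OLS ($\theta = 0$) to the within estimator ($\theta \to 1$); the variable names and the simple moment-based variance estimates are illustrative choices.

```python
# Sketch: within, between, and random-effects (quasi-demeaned) estimators
# on simulated panel data. Names and variance estimates are illustrative.
import numpy as np

rng = np.random.default_rng(0)
n_firms, T, beta = 200, 5, 0.5

firm = np.repeat(np.arange(n_firms), T)               # firm index for each obs
u = rng.normal(0.0, 1.0, n_firms)                     # random firm effects u_i
x = rng.normal(0.0, 1.0, n_firms * T)                 # free-cash-flow proxy
eps = rng.normal(0.0, 1.0, n_firms * T)
y = beta * x + u[firm] + eps                          # y_it = beta*x_it + u_i + e_it

def firm_mean(v):
    """Within-firm mean of v, broadcast back to every observation."""
    return (np.bincount(firm, weights=v) / T)[firm]

# Within (fixed-effects) estimator: demean within each firm, then OLS.
xd, yd = x - firm_mean(x), y - firm_mean(y)
beta_within = (xd @ yd) / (xd @ xd)

# Between estimator: OLS on the firm-level means.
xb = np.bincount(firm, weights=x) / T
yb = np.bincount(firm, weights=y) / T
xbc, ybc = xb - xb.mean(), yb - yb.mean()
beta_between = (xbc @ ybc) / (xbc @ xbc)

# Random-effects (GLS) estimator: quasi-demean by theta, then OLS.
sigma_eps2 = ((yd - beta_within * xd) ** 2).sum() / (n_firms * (T - 1) - 1)
sigma_u2 = max(((ybc - beta_between * xbc) ** 2).sum() / (n_firms - 2)
               - sigma_eps2 / T, 0.0)
theta = 1.0 - np.sqrt(sigma_eps2 / (sigma_eps2 + T * sigma_u2))
xq, yq = x - theta * firm_mean(x), y - theta * firm_mean(y)
beta_re = (xq @ yq) / (xq @ xq)

print(beta_within, beta_between, beta_re)             # all close to beta = 0.5
```

With $u_i$ independent of $x_{i,t}$, all three estimates land near the true $\beta = 0.5$; the random-effects estimate blends the within and between information rather than discarding either.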

Matthew Gunn

You can start with this thread. As already noted in the comments by fcop, one example of using random effects is when your variable has many levels (classrooms), so estimating that many parameters would require a large amount of data and considerable computational power. In such cases you are often not interested in the classroom effects themselves but in their influence in general: you assume that they vary but can be summarized by a common distribution. It could also be the case that you have only a sample of classrooms and the particular classrooms are not interesting in themselves, but are used to learn something about the general variability associated with classrooms. So you use random effects when you are not interested in estimating the parameters for your variable precisely, yet you want to account for the influence of that variable by estimating the distribution of the possible influences of its levels.
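As a hedged illustration of "estimating the distribution of influences rather than each level's own effect" (my sketch, not part of the answer; it assumes Python with statsmodels, and the variable names follow the classroom/GPA example from the question):

```python
# Sketch: a random-intercept model where we care about the classroom-level
# variance, not about each classroom's own intercept. Data are simulated.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n_classrooms, n_students = 50, 20

classroom = np.repeat(np.arange(n_classrooms), n_students)
classroom_effect = rng.normal(0.0, 0.4, n_classrooms)       # random intercepts
study_time = rng.normal(10.0, 2.0, n_classrooms * n_students)
gpa = (2.0 + 0.08 * study_time + classroom_effect[classroom]
       + rng.normal(0.0, 0.3, n_classrooms * n_students))

df = pd.DataFrame({"gpa": gpa, "study_time": study_time, "classroom": classroom})

# One variance component summarizes all 50 classrooms instead of 49 dummies.
model = smf.mixedlm("gpa ~ study_time", data=df, groups=df["classroom"])
result = model.fit()
print(result.fe_params)   # overall intercept and study_time slope
print(result.cov_re)      # estimated variance of the classroom intercepts
```

Here `result.cov_re` is the estimated between-classroom variance (it should come out near $0.4^2$ for this simulation); if you also wanted the `study_time` slope to vary by classroom, `MixedLM` accepts a `re_formula` argument for random slopes.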

Tim
  • Thanks a lot. I want to ask one more question: – Kang Inkyu Oct 02 '16 at 09:38
  • Say I want to analyze a dataset of 100 samples. If I create a dummy that represents each sample (say the variable is called Individual Nature, IN), then I would need 99 dummy variables. But I'm not really interested in the individual effects themselves (as many common analyses are); I want a summarized common distribution, which means I want to treat everyone in the sample as roughly the same. If this is the case, can I call it a random effects model? Or is this a fixed effects model because I only have one level of data? – Kang Inkyu Oct 02 '16 at 09:44
  • @KangInkyu You wouldn't be able to estimate either fixed or random effects for such data. In either case you would need at least a few cases per *group*. How would you estimate the mean and variance of a single measurement? There's nothing to estimate from a single point. – Tim Oct 02 '16 at 10:49
  • Oh, I see what you mean. I think I was confusing this with the individual fixed vs. individual random effects of panel data. Panel data contains several observations within each individual, so in that case I think it makes sense to build a random effects model treating each individual as a group, right? Thanks for the comment. Now I'm getting closer and closer to the ultimate grasp. – Kang Inkyu Oct 02 '16 at 11:01
  • One more question: does your explanation have something to do with the clustering effect (or the nested effect, if you will)? I cannot find an intuitive link between them. Is your comment applicable to the logic of clustering? Sorry for my bad English, by the way. – Kang Inkyu Oct 04 '16 at 12:23
  • @KangInkyu Answering shortly: yes, it does. You can find a clear introduction in this book: http://www.stat.columbia.edu/~gelman/arm/ – Tim Oct 04 '16 at 12:26
  • Thanks. But could you please relate this to your example above? Why is calculating a summarized common distribution for the school variable the same as accounting for a clustering effect? I understand each idea on its own, but I cannot see the link. Please give a brief comment if it doesn't bother you. Thanks again. – Kang Inkyu Oct 04 '16 at 12:32
  • @KangInkyu Example: if classes are nested within schools, then they have something in common; they come from a similar distribution -- this is what nesting is about. – Tim Oct 04 '16 at 12:39
  • That makes a lot of sense to me! But what if there are only two levels: students and classes? You said previously that students (level 1) cannot be made random. But in this case it is the students who are nested within the classes, which seems to contradict your comment. Am I getting something wrong here? – Kang Inkyu Oct 04 '16 at 12:44
  • @KangInkyu In that case the variance for students is the residual variance. The idea is that each level has its own variance; for "level 1" you already have it in ordinary linear regression. – Tim Oct 04 '16 at 12:56
  • Final question: so does that mean that with two-level data only ordinary linear regression is available? Because students are nested within classes, but they are level 1 and so should be regarded as fixed, and their variance is in ordinary linear regression. And classes are not nested within anything, so they should also be regarded as fixed. Therefore a fixed effects model is recommended? It may look somewhat stupid, but I'm a novice, so please forgive me. And thanks for all the comments. I'll wait for the final reply. – Kang Inkyu Oct 04 '16 at 13:01
  • @KangInkyu I'm sorry, but I do not understand your comment... Check the book I linked; it provides a nice introduction. – Tim Oct 04 '16 at 13:05
  • I will definitely check the book. If I go over the material and still cannot get the point, then I'll come back. Thanks :) – Kang Inkyu Oct 04 '16 at 13:07

About the dummy variable: that works if the variable has a limited number of values (like the classrooms in your case), but not when there is a huge number of values, and that is the trick; if you have a huge number of values then you get a huge number of intercepts (or slopes), thus a lot of dummies, and then you cannot estimate the model well (you lose many degrees of freedom because you have a lot of explanatory variables).

In that case you can use random effects; i.e. you assume that the intercepts are normally distributed, and then your huge number of dummies is ''summarised'' by a normal distribution. The latter has only two parameters (mean and standard deviation), so instead of estimating a huge number of coefficients (namely one for each of your dummies) you only have to estimate two parameters (mean and standard deviation), and you know the distribution of the intercepts. This saves a lot of degrees of freedom.
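A tiny numeric sketch of this point (my own illustration, not part of the answer): with many groups and only a few observations per group, the per-group dummy intercepts are individually noisy, while the two-parameter summary (the mean and standard deviation of the intercept distribution) can still be recovered well.

```python
# Sketch: many groups, few observations each. Per-group intercept estimates
# (the "dummy variable" route) are noisy; summarizing the intercepts by a
# mean and a standard deviation (the "random effects" route) needs only two
# numbers, estimated here with a simple method-of-moments calculation.
import numpy as np

rng = np.random.default_rng(2)
n_groups, n_per_group = 500, 3                      # many groups, few obs each

true_intercepts = rng.normal(5.0, 1.0, n_groups)    # intercepts drawn from N(5, 1)
y = true_intercepts[:, None] + rng.normal(0.0, 2.0, (n_groups, n_per_group))

# "Dummy" route: one estimated intercept per group -> 500 parameters.
group_means = y.mean(axis=1)
print(group_means.std(ddof=1))          # ~1.5: spread inflated by the noise term

# "Random effects" route: mean and sd of the intercept distribution -> 2 numbers.
within_var = y.var(axis=1, ddof=1).mean()           # noise variance (~4)
between_var = group_means.var(ddof=1) - within_var / n_per_group
print(group_means.mean(), np.sqrt(max(between_var, 0.0)))   # close to (5, 1)
```

The spread of the raw group means overstates the true spread of the intercepts because each mean still carries noise; the simple correction above recovers roughly the right mean and standard deviation using just two numbers.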