Comparing the random effects and fixed effects models

Question

Consider the random effects model $y_{it} = x_{it}'\beta + \mu_i + \nu_{it}$ where the composite error is $\mu_i + \nu_{it}$. We transform the variables and the error term of the regression equation using $\lambda = 1 - \sqrt{\frac{\sigma^2_\nu}{\sigma^2_\nu+T\sigma^2_\mu}}$. For example, the transformed dependent variable is $y_{it}-\lambda\bar{y}_i$. Books give the typical story that if $\sigma^2_{\mu}$ is very large, then $\lambda$ goes to 1 and $y_{it}-\lambda\bar{y}_i$ becomes $y_{it}-\bar{y}_i$, which is the within transformation, and hence we end up with the fixed effects model. I have three questions.

Question 1. Why do we assume in the random effects model that $\mu_i$ is a random variable, whereas in the fixed effects model it is not random? In both models the $\mu_i$ differ across individuals. The books give an interpretation to $\sigma^2_{\mu}$ above and treat the fixed effects model as a special case of $\sigma^2_{\mu}$ being very large or being close to 0. This suggests that we could in fact treat $\mu_i$ as random in the fixed effects model. Why do we not?

Question 2. Why is the fixed effects model called a fixed effects model? What is fixed in the model? Is it that $\mu_i$ does not vary within individuals and is therefore fixed? But this is also the case in the random effects model.

Question 3. If the variation in $\mu_i$ between individuals is of interest in a fixed effects model, why would the fixed effects estimator exploit the within variation? For the observables of the regression, the fixed effects estimator exploits the within variation, but when it comes to the error term the fixed effects model is concerned with the variation in $\mu_i$ across individuals. Why is this not a contradiction?
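To make the limiting argument in the question concrete, the following is a minimal Python sketch that evaluates $\lambda$ for a range of $\sigma^2_\mu$ values. The function names `re_lambda` and `quasi_demean` and all numeric values are illustrative assumptions, not taken from any book or library.

```python
import math


def re_lambda(sigma2_nu: float, sigma2_mu: float, T: int) -> float:
    """Weight lambda = 1 - sqrt(sigma2_nu / (sigma2_nu + T * sigma2_mu))."""
    return 1.0 - math.sqrt(sigma2_nu / (sigma2_nu + T * sigma2_mu))


def quasi_demean(y, lam):
    """Return y_it - lam * ybar_i for one individual's series y."""
    ybar = sum(y) / len(y)
    return [v - lam * ybar for v in y]


# Hold sigma2_nu = 1 and T = 5 fixed (assumed values) and let sigma2_mu grow.
for sigma2_mu in (0.0, 0.1, 1.0, 10.0, 1000.0):
    lam = re_lambda(1.0, sigma2_mu, T=5)
    print(f"sigma2_mu = {sigma2_mu:>7}: lambda = {lam:.4f}")
```

With $\sigma^2_\mu = 0$ the weight is exactly 0, so the data are left untransformed (pooled OLS); as $\sigma^2_\mu$ grows the weight approaches 1 and the quasi-demeaned series approaches the within-transformed one.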

Isabella Ghement · Answer 1 · 2018-11-23T01:52:19.127


Maybe an intuitive way to think about fixed versus random effects would be to imagine what would happen if you were to repeat your study under similar conditions a large number of times.

Fixed Effect

You are interested in how study time impacts reading scores for students enrolled in all 3 schools in your local area: School 1, School 2 and School 3. The 3 schools are the only ones you are interested in. If you were to repeat your study many times, the schools would be the same across your many (random) study samples. In other words, the levels of the categorical factor school - School 1, School 2 and School 3 - would be fixed across study samples.

Random Effect

You are interested in how study time impacts reading scores for students enrolled in a random sample of 3 schools in your local area. The 3 schools represent a larger pool of schools you are interested in, so you are not interested in them other than for what they tell you about the pool of schools. If you were to repeat your study many times, the schools would be different across your many (random) study samples. The first study sample might include School 1, School 5 and School 8. The second study sample might include School 2, School 4 and School 10, etc. In other words, the levels of the categorical factor school represented in any particular study sample would be a random sample from the entire pool of levels.

Putting it all together

In practice, you only have one study sample to work with, so that can make it a bit harder to understand why you treat the effects of a grouping factor such as school as fixed or random. But nothing is stopping you from asking yourself: If I repeated my study under similar conditions, would I get to see the exact same levels of the grouping factor or totally different levels? If the exact same levels, I can treat the levels of the grouping factor seen in the study sample as fixed, knowing these levels are the only ones I am interested in for my study. If totally different levels, I can view the levels of the grouping factor seen in the study sample as a random subset of the larger pool of levels I am really interested in for my study.

Addendum:

Let's say you have a response variable, Weight, which you measure for each of a series of children at the same times, where the times consist of Time = 0 (Birth), Time = 1 (5 years), Time = 2 (10 years) and Time = 3 (15 years). If you plot Weight vs. Time for each child, let's say you see an increasing linear relationship with the same slope for each child. In other words, Weight tends to increase as a function of time for each child, and it increases at the same rate for each child. However, each child starts with a different Weight at birth - some children might start with a weight of 4 kg, others with a weight of 3.5 kg, others with a weight of 4.5 kg, etc.

If you fit a linear mixed effects model to these data, that is akin to fitting a series of linear models - one for each child - but where you allow the child-specific intercepts to deviate from the intercept for the "average child" by an amount $\mu_i$. For a particular child, this amount could be negative (meaning the child starts out with a below-average birth weight), zero (meaning the child starts out with an average birth weight) or positive (meaning the child starts out with an above-average birth weight). This deviation $\mu_i$ changes randomly from child to child, so you will impose the constraint that it actually follows a normal distribution with mean 0 and a particular, unknown standard deviation $\sigma$.

By imposing this constraint, you are essentially saying that the average child has a deviation of 0 from the average weight at birth and that an equal number of children in your target population have below-average birth weight and above-average birth weight. You are also saying that the chance that the next child you'd look at has below or above average weight at birth is random. Finally, you are saying that about 95% of the children in your target population have deviations $\mu_i$ falling within $\pm 2\sigma$ of 0.
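A quick simulation can illustrate these statements. This is only a sketch: the value $\sigma = 0.5$ kg, the number of simulated children and the seed are assumptions chosen for illustration, and the helper names are hypothetical.

```python
import random


def simulate_deviations(n_children, sigma, seed=0):
    """Draw birth-weight deviations mu_i from a normal(0, sigma) distribution."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, sigma) for _ in range(n_children)]


def share_within(devs, sigma, k=2.0):
    """Fraction of deviations lying within +/- k * sigma of zero."""
    return sum(abs(d) <= k * sigma for d in devs) / len(devs)


def share_below_zero(devs):
    """Fraction of children with a below-average birth weight."""
    return sum(d < 0 for d in devs) / len(devs)


sigma = 0.5  # assumed standard deviation of the deviations, in kg
devs = simulate_deviations(100_000, sigma)

print("mean deviation:       ", sum(devs) / len(devs))
print("share below average:  ", share_below_zero(devs))
print("share within 2 sigma: ", share_within(devs, sigma))
```

In a large simulated population the mean deviation should be close to 0, roughly half of the children should fall below the average, and roughly 95% should fall within two standard deviations of it.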

To sum up, you are treating $\mu_i$ as a random intercept effect. This random effect is meant to capture the net effect of all the subject-specific influences (unobserved, or perhaps observed but not currently included in the model) which conspire to affect how large or small a baby's weight is at birth. Usually we treat $\mu_i$ as a random intercept effect when the babies included in the study are selected so as to represent a larger pool of babies. So it's not just that the $\mu_i$'s differ across individuals - they would have to differ randomly from one individual to another (i.e., in unpredictable rather than systematic ways). This model is called a random effects model because the birth weights of the babies are assumed to vary randomly from one baby to another. In practice, you would want to have at least 5 babies to contemplate such a model and the 5 babies would have to be selected at random at birth - perhaps from the same hospital - and then followed up until they are 15 years old.

In a fixed effects setting, you would only focus on a small number of babies but those would be the only ones you would care about and you wouldn't want to generalize the findings of your fixed effects model to any other children from a larger pool - you would simply be interested in those 5 concrete children. Whereas the random effects model assumes that children are somewhat similar - they all have birth weights in the vicinity of the average birth weight - the fixed effects model does not make any assumption of similarity among the birth weights of concrete children. In fact, it treats those birth weights as distinct quantities which have nothing to do with each other.
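One way to see that the fixed effects approach places no distributional assumption on the $\mu_i$ is through the within transformation from the question: since $y_{it}-\bar{y}_i = (x_{it}-\bar{x}_i)'\beta + (\nu_{it}-\bar{\nu}_i)$, the $\mu_i$ drop out entirely. The sketch below, in the question's notation with a single regressor, checks this numerically. The function names, the true slope of 2, the noise levels and the two sets of $\mu_i$ values are all assumptions made for illustration.

```python
import random


def simulate_panel(mu, beta=2.0, T=4, seed=0):
    """Generate y_it = beta * x_it + mu_i + nu_it for the given list of mu_i.

    The x_it and nu_it draws depend only on the seed, not on mu.
    """
    rng = random.Random(seed)
    x, y = [], []
    for mu_i in mu:
        x_i = [rng.gauss(0.0, 1.0) for _ in range(T)]
        nu_i = [rng.gauss(0.0, 0.5) for _ in range(T)]
        x.append(x_i)
        y.append([beta * x_it + mu_i + nu_it for x_it, nu_it in zip(x_i, nu_i)])
    return x, y


def within_slope(x, y):
    """Slope from regressing (y_it - ybar_i) on (x_it - xbar_i), pooled over i."""
    num = den = 0.0
    for x_i, y_i in zip(x, y):
        xbar = sum(x_i) / len(x_i)
        ybar = sum(y_i) / len(y_i)
        for x_it, y_it in zip(x_i, y_i):
            num += (x_it - xbar) * (y_it - ybar)
            den += (x_it - xbar) ** 2
    return num / den


# Case 1: intercepts drawn from a normal distribution.
rng = random.Random(1)
mu_drawn = [rng.gauss(0.0, 1.0) for _ in range(5)]
# Case 2: arbitrary constants with no distribution behind them.
mu_arbitrary = [100.0, -50.0, 3.0, 0.0, 7.5]

b_drawn = within_slope(*simulate_panel(mu_drawn))
b_arbitrary = within_slope(*simulate_panel(mu_arbitrary))
print(b_drawn, b_arbitrary)
```

Because the same seed produces the same $x_{it}$ and $\nu_{it}$ in both cases, the two slope estimates agree up to floating-point rounding, whatever values the $\mu_i$ take.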

edited Nov 23 '18 at 01:52

answered Nov 08 '18 at 14:46

Isabella Ghement


Thanks for this elaborate answer. A similar explanation is given here https://stats.stackexchange.com/questions/151784/fixed-random-effects-model. The answer makes intuitive sense. But I am not convinced. – Snoopy Nov 09 '18 at 13:46
1. That $\mu_i$ is fixed means that $\mu_i$ has no distribution. But $\lambda = 1 - \sqrt{\frac{\sigma^2_\nu}{\sigma^2_\nu+T\sigma^2_\mu}}$ tells us that to end up with the fixed effects model $\sigma^2_\mu$ should be very large. But if we are talking about a very large $\sigma^2_\mu$, then $\sigma^2_\mu$ must first exist and then it must be very large. But instead we say that it does not exist. Is there not a contradiction here? – Snoopy Nov 09 '18 at 13:46
2. Why should I expect the exact same levels if I repeated my study? Suppose the number of entities is large. If I repeat my study, the probability that I will get the exact same $\mu_i$ for every entity is very low. – Snoopy Nov 09 '18 at 13:46
3. Based on some sample data, $\mu_i$ can be estimated and the variance of $\mu_i$ can be calculated as if we have assumed that it has a distribution with variance $\sigma^2_\mu$. If I can estimate this variance, what does it mean that it does not exist? – Snoopy Nov 09 '18 at 13:46
Whether or not you should expect the same levels is NOT based on statistical considerations but rather on subject-matter considerations. As the investigator who plans & conducts the study, YOU decide what you are interested in. Granted, sometimes the decision is not as clear-cut. – Isabella Ghement Nov 09 '18 at 14:03
Be it subject-matter. Why should I expect to get the exact same 1000 $\mu_i$ if I repeated my study over 1000 other entities? I am also seeking answers to my other questions on the formulas above. – Snoopy Nov 09 '18 at 14:07
The number of entities can be large indeed, but if you decide you only want to learn something about 5 of those entities (e.g., how does each entity perform by itself, how do entities compare to each other), then you will include a fixed effect for entity in your model. So the number is not what drives your decision, but rather what YOUR focus is as an investigator, what YOU want to learn about. If you wanted to learn about ALL entities rather than just those specific five, you would add a random entity effect to your model. – Isabella Ghement Nov 09 '18 at 14:10
Because YOU would make a conscious decision to include only those 5 entities you care about each time you were to conduct your study. If, for example, you wanted to see if the results of your initial study replicate across a series of two new studies, you would make sure that each of the two new studies investigates the same 5 entities which are your sole focus. – Isabella Ghement Nov 09 '18 at 14:13
I think you have to think about this issue from the investigator perspective - you seem to come at it from the statistician's perspective, which impedes your ability to understand the nuances involved. – Isabella Ghement Nov 09 '18 at 14:14
Why should I be interested in 5 out of 1000 entities? Suppose I am interested in 20. What is the probability of getting the exact same 20 $\mu_i$ if I repeated my study? Close to 0. I do not find the argument satisfactory. The example in general makes sense but it does not address my question of why a distribution does not exist. – Snoopy Nov 09 '18 at 14:18
I hope others on this site will address your theoretical questions - I just wanted to provide some intuition through a simplified example. For more intuition on random effects - which have many more nuances than captured in my simplified example - see Hodges's article: http://www.biostat.umn.edu/~hodges/PubH8492/Hodges-ClaytonREONsubToStatSci.pdf. – Isabella Ghement Nov 09 '18 at 14:18
The probability of getting the exact same 20 is 1 because you would set up your next study (and the one after that, etc.) to make darn sure those exact same 20 would be captured. If those 20 are THE ONLY ONES YOU CARE ABOUT (for a fixed effect situation), why would you want your next study (and the one after that, etc.) to include 20 completely different entities? It just doesn't make any conceptual sense. – Isabella Ghement Nov 09 '18 at 14:22
Why would you select the same 20 entities if you are repeating your study? That is not random sampling. And besides that, what is the point of repeating the study if you are going to select the same 20 entities in your repeated study? That does not make any sense. – Snoopy Nov 09 '18 at 14:47
If you are familiar with the replicability crisis in science, you would know that researchers are concerned about the results of a one-off study not necessarily being replicable across future similar studies. So they would set out to conduct similar studies to see whether replicability of findings is achieved. If the original study involved 3 hospitals (the only ones of interest in the study) and 90 patients (30 patients selected at random from each hospital), then any future studies for investigating replicability of findings would still include only the 3 hospitals. To be continued... – Isabella Ghement Nov 09 '18 at 16:10
But the 3 random samples of patients selected from each of the 3 hospitals would be different across replications of the original study. That's what I am talking about! – Isabella Ghement Nov 09 '18 at 16:11
If the 3 hospitals are the only ones you care about in your original study and all of its subsequent reenactments, then you would include a fixed hospital effect in your modeling for each study. – Isabella Ghement Nov 09 '18 at 16:13
If instead the original study included a random selection of 5 hospitals (say) and then selected 30 patients at random from each hospital, then any future replication of that study would include a different random sample of 5 hospitals and also different random samples of 30 patients from each of those new hospitals. In this case, the modeling for each study would include a random hospital effect. – Isabella Ghement Nov 09 '18 at 16:18
No, the interest does not and cannot lie in the exact 3 hospitals. It lies in whether the estimates of $\mu_i$ are the same, or close to each other (because there is no reason to expect the exact same estimates), if the study is replicated. There is no sense in keeping the hospitals the same, not from the random sampling perspective, or any other perspective. I cannot imagine that the replicability crisis is about not keeping the hospitals the same across studies. On the contrary, if 3 other hospitals are selected at random, the results should still hold. If they do not, that is the crisis. – Snoopy Nov 09 '18 at 16:53
I do not know why patients are brought into the picture now because the entities of interest are hospitals, not patients; the dependent variable is about hospitals, not patients, at least this is the sense in which the discussion should develop here. – Snoopy Nov 09 '18 at 16:57
You cannot have a random effect unless you have some sort of grouping going on - for example, the hospital is the grouping variable and the patients within hospitals are giving you the data you need for a random effects model. As an example, you might relate patient age to patient height (Y) via a linear mixed effects model. For ease, let's say patients are newborn babies. The model could include a random effect for hospital. – Isabella Ghement Nov 10 '18 at 02:24
That means that you believe the relationship between age and height of newborns has the same slope across hospitals but a different intercept for each hospital. – Isabella Ghement Nov 10 '18 at 02:29
I meant to say the patients are children who are newly born - anywhere from 0 to 7 days old, say. – Isabella Ghement Nov 10 '18 at 02:37
It's the same in my original example - you care about how study time impacts reading scores for STUDENTS within SCHOOLS. – Isabella Ghement Nov 10 '18 at 02:51
I could be more specific. The random effects model I am interested in is where entities are observed over time. See the model in the original post. $i$ indexes individuals, and $t$ indexes time. There is grouping with respect to time. Hence, I still do not have answers to my questions. – Snoopy Nov 18 '18 at 18:47
@Snoopy: I added an Addendum in case that helps, though I have a feeling you are still looking for a theoretical answer. Maybe others can chime in - questions on this forum are answered by volunteers and there is no guarantee you will receive the answers you are looking for. – Isabella Ghement Nov 23 '18 at 01:53

I very much appreciate the replies regardless. – Snoopy Nov 24 '18 at 14:12
