Confused about multilevel analysis and non independence of observations

Question

I'm still struggling to understand multilevel analysis and whether it applies to my problem. I've read the following here (where the author gives an example of a multilevel model with classrooms):

Classrooms pertain to a level (rather than a predictor variable), since (a) classrooms were randomly sampled from a population of units (classrooms around the world are potentially infinite and you have sampled some of them), and (b) classrooms have no intrinsic meaning per se (classrooms are interchangeable units without theoretical content).

Next, the author says the following:

On the contrary, socioeconomic status would for instance pertain to a predictor variable (rather than a level) since its categories are both non-random and theoretically meaningful (e.g. lower, middle, and upper class are not “atheoretical” random units).

So far so good. Where I'm confused is when the author adds:

With (...the classroom as) a data structure, you cannot run a standard logistic regression analysis. The reason is that this violates one of the most important assumptions in the linear model, namely the assumption of independence (or lack of correlation) of the residuals (...). Observations are interdependent: Participants nested in the same cluster are more likely to function in the same way than participants nested in different clusters.

Isn't the same true for observations within the different socioeconomic groups? Aren't the observations within a given socioeconomic group also interdependent? If so, does that mean we should also treat socioeconomic status as a level in a multilevel model?

EDIT: to clarify in response to the comment below, suppose you want to fit a linear regression with socioeconomic status as a predictor. Is multilevel analysis appropriate here? I would be inclined to say no, since the categories are not random. On the other hand, observations within one category could be interdependent. What can I do?
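
To make the question concrete, here is a minimal simulation sketch. It is not from the quoted text: the function names, the group sizes, and the choice of a one-way ANOVA estimator for the intraclass correlation (ICC) are all assumptions made for illustration. It generates responses that share a group-level shift, measures how strongly members of the same group resemble each other, and then shows what happens once each group's own mean is removed, which is what entering the grouping variable as a categorical predictor does in a linear regression with no other predictors.

```python
import random
import statistics


def simulate(n_groups, n_per_group, sd_group, seed=0):
    """Hypothetical data: y_ij = u_j + e_ij, with u_j shared inside group j."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_groups):
        u = rng.gauss(0.0, sd_group)  # shift common to every member of the group
        data.append([u + rng.gauss(0.0, 1.0) for _ in range(n_per_group)])
    return data


def anova_icc(groups):
    """One-way ANOVA estimate of the ICC (assumes equal group sizes)."""
    k = len(groups)
    n = len(groups[0])
    means = [statistics.fmean(g) for g in groups]
    grand = statistics.fmean([y for g in groups for y in g])
    # Mean square between groups and mean square within groups.
    msb = n * sum((m - grand) ** 2 for m in means) / (k - 1)
    msw = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g) / (k * (n - 1))
    return (msb - msw) / (msb + (n - 1) * msw)


def center_by_group(groups):
    """Subtract each group's own mean, as a per-group fixed intercept would."""
    return [[y - statistics.fmean(g) for y in g] for g in groups]


clustered = simulate(n_groups=200, n_per_group=20, sd_group=1.0, seed=1)
independent = simulate(n_groups=200, n_per_group=20, sd_group=0.0, seed=1)

icc_clustered = anova_icc(clustered)      # group shift present
icc_independent = anova_icc(independent)  # no group shift
centered = center_by_group(clustered)
largest_group_mean = max(abs(statistics.fmean(g)) for g in centered)

print("ICC with a shared group shift:", round(icc_clustered, 3))
print("ICC without a group shift:", round(icc_independent, 3))
print("largest residual group mean after centering:", largest_group_mean)
```

In this sketch, ignoring the grouping leaves a shared component in the errors, whereas giving each group its own estimated mean absorbs that component. With many sampled groups (classrooms), estimating one mean per group is costly and says nothing about unsampled groups, which is the usual motivation for a random effect; with a handful of fixed, meaningful categories (socioeconomic classes), estimating one coefficient per category is the ordinary route.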

Generally, we do not think there is residual in logistic regression. — user158565, May 01 '19 at 19:46
Thank you @user158565. I think that you can forget about the logistic part here. My question is about the multilevel analysis vs non independence of observations wrt random groups and non random groups. — Patrick, May 01 '19 at 21:38
If you want to explain in more detail, you're welcome to! For my part, I added an edit section to clarify my question. Thanks again. — Patrick, May 03 '19 at 13:43
@user158565 I confess I don't understand what you are trying to say, either, because I consider the *deviance* to be a form of residual. That suggests you might be using "residual" in a special or narrow way, but in what way, exactly, and to what purpose? Note, too, that the quotation in question doesn't literally mean "residuals," but really is talking about how the *conditional responses* are modeled as independent random variables. Its last sentence clarifies this by stating that the problem is "*observations* [not residuals] are interdependent." — whuber, May 03 '19 at 14:06
This might be of value to you: https://stats.stackexchange.com/questions/4700/what-is-the-difference-between-fixed-effect-random-effect-and-mixed-effect-mode. Ben Bolker's answer is basically the same as your textbook's. — Huy Pham, May 03 '19 at 15:28
Interesting link (I had already read it by the way). Unfortunately it does not tell me if a variable like socioeconomic status could be a random effect or not (observations could be interdependent but groups are not random). Maybe @whuber has an answer to offer to that question? Thanks again. — Patrick, May 03 '19 at 18:33
I would suggest that the answer can be obtained by referencing a probability model for the statistical problem: that will clearly distinguish fixed from random effects. Because a given problem or given dataset often can be (validly) modeled in distinctly different ways, there is scope for confusion if the model is not explicitly invoked. — whuber, May 03 '19 at 19:28
@whuber It seems we agree that the referenced book/article has some problems in its description of the logistic model. Do you think the OP should still stick with this book/article, rather than finding other, error-free material to learn from? — user158565, May 04 '19 at 20:58
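
Following the suggestion in the comments to write down a probability model, one possible formulation is sketched below. The notation is assumed for illustration and does not come from the quoted article: $y_{ij}$ is the response of observation $i$ in group $j$, $\mu$ is an overall mean, and $\varepsilon_{ij}$ are independent errors with variance $\sigma^2$. In the first model the group effects $\alpha_j$ are unknown constants (one per socioeconomic category, say); in the second the group effects $u_j$ are draws from a population of groups (classrooms, say).

```latex
% Fixed group effects: the alpha_j are parameters to be estimated.
y_{ij} = \mu + \alpha_j + \varepsilon_{ij},
\qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2) \text{ independent},
\qquad \operatorname{Cov}(y_{ij}, y_{i'j}) = 0 \quad (i \neq i').

% Random group effects: the u_j are random variables, independent of the errors.
y_{ij} = \mu + u_j + \varepsilon_{ij},
\qquad u_j \sim \mathcal{N}(0, \tau^2),
\qquad \operatorname{Cov}(y_{ij}, y_{i'j}) = \tau^2 \quad (i \neq i'),
\qquad \operatorname{Corr}(y_{ij}, y_{i'j}) = \frac{\tau^2}{\tau^2 + \sigma^2}.
```

Under these assumptions, both models allow members of the same group to resemble each other. The fixed-effects version expresses the resemblance through the group-specific mean $\mu + \alpha_j$ and treats observations as independent given that mean, while the random-effects version expresses it as a positive within-group correlation.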
