
Suppose `ses` is a level-1 predictor of students' math scores, and `sector` is a level-2 predictor denoting the type of school (public vs. private).

Given the above, are the following models legitimate (syntax-wise)?

(Note: I don't think I can have `sector` vary randomly at the school level, because `sector` is itself a school-level predictor. Am I right?)

```r
library(lme4)

hsb <- read.csv('https://raw.githubusercontent.com/rnorouzian/e/master/hsb.csv')

m1 <- lmer(math ~ ses + sector + (ses:sector | sch.id), data = hsb)
m2 <- lmer(math ~ ses + sector + (ses*sector | sch.id), data = hsb)
```
rnorouzian

1 Answer


The second model may not be appropriate because it specifies random slopes for `sector`, which is a level-2 variable. If you have reason to think that its effect should vary by school, then it could be OK.

The first model may be appropriate if you have reason to believe the cross-level interaction should vary by school, AND it is supported by the data. The problem with this model is that it corresponds somewhat to a model with a (fixed) interaction term but without the main effects, and that rarely makes sense.
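To make the difference between the two random-effects terms concrete, here is a minimal base-R sketch of which columns each left-hand side expands to. It uses a small made-up data frame (`toy`, not the `hsb` file) and assumes `sector` is coded as numeric 0/1. As far as I know, lme4 builds the random-effects design from the left-hand side of the bar in much the same way, but treat that as an assumption rather than a statement about its internals.

```r
# Toy data (hypothetical, not the hsb file): sector is numeric 0/1
toy <- data.frame(ses = c(-1, 0, 1, 2), sector = c(0, 0, 1, 1))

# Columns implied by the random part of m1: (ses:sector | sch.id)
cols_m1 <- colnames(model.matrix(~ ses:sector, data = toy))

# Columns implied by the random part of m2: (ses*sector | sch.id)
cols_m2 <- colnames(model.matrix(~ ses * sector, data = toy))

print(cols_m1)  # intercept plus the interaction only, no main effects
print(cols_m2)  # intercept, ses, sector, and the interaction
```

So `ses:sector` on its own asks for a random intercept and a random interaction slope with no random main effects, while `ses*sector` additionally asks for random slopes for `ses` and for `sector`.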


Edit:

I just want to expand on my remark above about fitting random slopes for a variable that does not vary within the grouping factor. Statistically speaking, we might have a dataset where such a model converges without warning. What I am trying to say is that I'm not sure whether I can make sense of such a model. Maybe there are some edge cases, or pathological examples, where it does make sense, but I would like to think that in real-world situations it would not make sense to specify a model with random slopes for a variable that is constant within groups. One way to help think about this is to plot the data with the outcome on the $y$ axis and the variable in question on the $x$ axis, and then to plot a separate line for each group. This will be impossible, since each group's data points will occupy the same value of the $x$ variable and the slope will be undefined.
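The "undefined slope" point can be checked numerically without fitting a mixed model. The following is a rough sketch on simulated data (the data frame `toy` and its coefficients are invented for illustration; only the column names mirror the question). It fits a separate ordinary regression within each school, which is not what `lmer` does, but it shows why there is no within-school information about a `sector` slope.

```r
set.seed(1)

# Hypothetical toy data: 4 schools, 5 students each; sector is fixed per school
toy <- data.frame(
  sch.id = rep(1:4, each = 5),
  sector = rep(c(0, 0, 1, 1), each = 5),
  ses    = rnorm(20)
)
toy$math <- 50 + 3 * toy$ses + 2 * toy$sector + rnorm(20)

# Number of distinct sector values within each school (1 means constant)
n_sector_values <- tapply(toy$sector, toy$sch.id,
                          function(x) length(unique(x)))

# Per-school slope of math on sector: not estimable, so lm() reports NA
sector_slopes <- sapply(split(toy, toy$sch.id), function(d)
  coef(lm(math ~ sector, data = d))[["sector"]])

# Per-school slope of math on ses, for contrast: estimable in every school
ses_slopes <- sapply(split(toy, toy$sch.id), function(d)
  coef(lm(math ~ ses, data = d))[["ses"]])

print(n_sector_values)
print(sector_slopes)
print(ses_slopes)
```

Every school has a single `sector` value, so the within-school `sector` slope comes back as `NA`, whereas the `ses` slope is defined for each school because `ses` varies among students in the same school.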

I should probably admit at this point to using way too much sarcasm in some of my answers on this topic. My justification for doing so is to encourage analysts to think about what their models mean.

Robert Long
  • Thanks Rob, but a cross-level interaction already includes schools, no? Could you clarify a bit? Please remember we used these models [HERE](https://stats.stackexchange.com/questions/489059/obtaining-correlation-between-random-effects-separately-for-2-groups), but now I have serious doubts about the legitimacy of these models. – rnorouzian Oct 07 '20 at 17:50
  • No, I mean the cross level interaction `ses:sector`. The legitimacy of the models depends on 1) whether there is theoretical (from domain knowledge) justification AND 2) whether they are supported by the data. – Robert Long Oct 07 '20 at 18:34
  • Rob, I may be misunderstanding this. But a cross-level interaction (`ses:sector`) can only be found when we consider the relationship between `math ~ ses` as being random (varying across schools) and then consider how this relation differs between `sector==0` and `sector==1` (as depicted in [**this visual**](https://stats.stackexchange.com/q/490302/140365)), the answer becomes a cross-level interaction. Now, I have difficulty understanding how such a cross-level interaction can AGAIN `vary by school` and hence the use of `ses:sector`? (*My question is about understanding the syntax only*) – rnorouzian Oct 07 '20 at 20:13
  • I agree with you completely. I think you have missed my sarcasm again :) I will update my answer. – Robert Long Oct 07 '20 at 20:15
  • For my edification, I want to see this plot. What would be the variable on the x-axis in the above example given that you noted: *"One way to help think about this is to plot the data with the outcome on the y axis, **the variable in question on the x axis** and then to plot a separate line for each group. This will be impossible since each group's data points will occupy the same value for the x variable and the slope will be undefined."* – rnorouzian Oct 07 '20 at 20:25
  • @rnorouzian ahh good question !! That's rather outside the focus of this question/answer. Please make a new post about how to visualise the folly of fitting random slopes for variables that don't vary within groups, and I will answer it with code and graphics :) – Robert Long Oct 07 '20 at 20:29
  • [HERE](https://stats.stackexchange.com/questions/490919/how-to-visualize-the-folly-of-fitting-random-slopes-for-variables-that-dont-var) is the question. – rnorouzian Oct 07 '20 at 20:35
  • Rob, it seems you got busy, in the meantime, you mean this: `hsb10 – rnorouzian Oct 07 '20 at 21:50