I am testing an interaction effect where $X$ and $Y$ are continuous variables and $M$ (the moderator) is a categorical variable (effect coded as $+1$, $-1$).

I have no clue about how to do a post-hoc probing of slopes in this case.

For a continuous moderator we calculate $Z_{\rm above}$ and $Z_{\rm low}$, calculate their cross-products with $X$ (e.g., $Z_{\rm above}*X$ and $Z_{\rm low}*X$), and then run the regression of $Y$ on their ${\rm High}$ and ${\rm Low}$ combinations.

I do not see how the method above could be applied to a categorical moderator. How can I test that?

gung - Reinstate Monica
Rahul

1 Answer

I'm not sure I follow your strategy with continuous moderators, but the proper approach with categorical moderators is essentially the same as with continuous ones. Effect coding is a perfectly reasonable strategy for representing categorical covariates, and it makes no difference in terms of how you test the interaction. To test an interaction, we add a product term to the model and test the beta associated with that term. Assume $Y$ is normally distributed, so that we're talking about a standard linear regression model, and that $X$ and $Z$ are continuous. Let's say that you are primarily interested in the relationship between $X$ and $Y$, but think the specific nature of that relationship may depend on the value of $Z$, so we will call $Z$ a moderator here. Then, the model is: $$ \hat{Y}=\hat{\beta}_0+\hat{\beta}_1X+\hat{\beta}_2Z+\hat{\beta}_3XZ $$ The test of $\hat{\beta}_3$ will assess the existence of the interaction. Note that you have to include $Z$ in the model (see here and here for discussions of that issue). Now, let's consider a situation like yours, where you wonder if a categorical covariate, $Z$, with two levels moderates the relationship between $X$ and $Y$. Then your model would be: $$ \hat{Y}=\hat{\beta}_0+\hat{\beta}_1X+\hat{\beta}_2Z+\hat{\beta}_3XZ $$ exactly the same! Note that the coding strategy for $Z$ is not represented in the formula because it's irrelevant: reference cell coding or any other valid coding scheme would be entered and tested the same way. Again, you examine the test of $\hat{\beta}_3$ to see if the moderation is 'significant'.
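
To connect this to probing the slopes: with the effect coding from the question ($Z=+1$ and $Z=-1$), substituting each code into the model above gives one regression line per group. This is only a worked substitution under that coding, not an additional model:

$$
\begin{aligned}
Z=+1:&\quad \hat{Y}=(\hat{\beta}_0+\hat{\beta}_2)+(\hat{\beta}_1+\hat{\beta}_3)X \\
Z=-1:&\quad \hat{Y}=(\hat{\beta}_0-\hat{\beta}_2)+(\hat{\beta}_1-\hat{\beta}_3)X
\end{aligned}
$$

So the two simple slopes are $\hat{\beta}_1+\hat{\beta}_3$ and $\hat{\beta}_1-\hat{\beta}_3$, and they differ by $2\hat{\beta}_3$. If you instead recode $Z$ as $0/1$ (reference cell coding), $\hat{\beta}_1$ becomes the slope of $X$ in the group coded $0$, so its usual $t$-test is a test of that group's simple slope; swapping which group is coded $0$ gives the test for the other group.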

The situation is a little more complicated if your categorical covariate has more than two levels. As you probably know, categorical covariates with $k$ levels are represented by $k-1$ 'dummy' variables. Thus, for example, if $k=3$, you need two new variables. Let's assume the situation is as above, but you are wondering if the relationship is moderated by $Z$, a categorical covariate with $k$ levels, where $k$ can be arbitrarily large. Then the model would be: $$ \hat{Y}=\hat{\beta}_0+\hat{\beta}_1X+\hat{\beta}_2Z_1+\hat{\beta}_3Z_2+\cdots+\hat{\beta}_kZ_{k-1}+\hat{\beta}_{k+1}XZ_1+\hat{\beta}_{k+2}XZ_2+\cdots+\hat{\beta}_{2k-1}XZ_{k-1} $$ That's an ugly formula, but it's the way it's done. The important part is this: because you now have more than one $Z$ variable, to test if the moderation is 'significant', you drop all $k-1$ interaction terms, fit the reduced model, and perform a nested model test. In a standard linear regression context like the situation we're assuming here, that can be the F change test: $$ F_{change}=\frac{\left(\frac{SSE_{reduced}-SSE_{full}}{k-1}\right)}{\left(\frac{SSE_{full}}{df_{full}}\right)} $$ where $SSE$ is the sum of squared errors from the ANOVA table, and $df_{full}$ is the error (residual) degrees of freedom of the full model, i.e., $N-p$, with $N$ the number of observations and $p$ the number of parameters (betas) you are estimating for that model. This $F_{change}$ value is assessed against the $F$ distribution with $(k-1,\ df_{full})$ degrees of freedom. If you were working with a generalized linear model, you would use the likelihood ratio test instead. (NB, most software can do these tests for you; e.g., in R the anova() command can perform nested model tests.)
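
As a minimal sketch of the arithmetic in the $F_{change}$ formula, the Python function below computes the statistic from the two error sums of squares. The function name `f_change` and the numbers in the example are hypothetical, chosen only for illustration; in practice the $SSE$ values come from your fitted full and reduced models.

```python
def f_change(sse_reduced, sse_full, k, df_full):
    """F change statistic for dropping the k-1 interaction terms.

    sse_reduced: SSE of the model without the interaction terms
    sse_full:    SSE of the model with the interaction terms
    k:           number of levels of the categorical moderator
    df_full:     error (residual) degrees of freedom of the full model
    """
    if k < 2:
        raise ValueError("a categorical moderator needs at least 2 levels")
    if df_full <= 0 or sse_full <= 0:
        raise ValueError("df_full and sse_full must be positive")
    # Numerator: drop in SSE per interaction term removed
    numerator = (sse_reduced - sse_full) / (k - 1)
    # Denominator: mean squared error of the full model
    denominator = sse_full / df_full
    return numerator / denominator


# Hypothetical values, for illustration only
f_value = f_change(sse_reduced=120.0, sse_full=100.0, k=3, df_full=50)
print(f_value)  # 5.0, to be compared against F with (2, 50) degrees of freedom
```

The function returns only the statistic. For a p-value you would look it up in the $F$ distribution with $(k-1,\ df_{full})$ degrees of freedom; if SciPy is available, `scipy.stats.f.sf(f_value, k - 1, df_full)` is one way to do that.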

To understand the effect of the moderation, I think it's best to make a scatterplot of the data and superimpose several regression lines over the points, one for each level of the moderator (i.e., $k$ lines, not $k-1$). In addition, it's typically best to plot the points associated with the different levels of $Z$ with different symbols and colors. If your moderator is continuous, it's often convenient to plot lines at the mean of $Z$, 1 SD above the mean and 1 SD below, which is what I think you are referring to in the question.
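
The lines for such a plot can be sketched as follows. This is an illustrative Python sketch, not the only way to do it: it fits a separate least-squares line within each level of the moderator, which yields the same fitted lines as the model containing all the $X \times Z$ product terms. The function names `fit_line` and `lines_by_level` and the toy data are hypothetical, and the sketch gives point estimates only (no standard errors or tests).

```python
from collections import defaultdict


def fit_line(xs, ys):
    """Ordinary least-squares (intercept, slope) for one group."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("X has no variation within this group")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def lines_by_level(x, y, z):
    """Return {level: (intercept, slope)}, one line per moderator level."""
    groups = defaultdict(lambda: ([], []))
    for xi, yi, zi in zip(x, y, z):
        groups[zi][0].append(xi)
        groups[zi][1].append(yi)
    return {level: fit_line(xs, ys) for level, (xs, ys) in groups.items()}


# Toy data (made up): effect-coded moderator with levels +1 and -1
x = [0.0, 1.0, 2.0, 0.0, 1.0, 2.0]
y = [1.0, 3.0, 5.0, 2.0, 2.5, 3.0]
z = [1, 1, 1, -1, -1, -1]

for level, (intercept, slope) in lines_by_level(x, y, z).items():
    print(f"Z={level:+d}: intercept={intercept:.2f}, slope={slope:.2f}")
```

Each `(intercept, slope)` pair can then be drawn over the scatterplot with whatever plotting tool you use, one line per level, with the points of each level in a matching color or symbol.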

gung - Reinstate Monica
  • Thanks gung for your detailed answer. I didn't get notified of this answer, so sorry for the late reply. I am actually interested to know how to conduct post-hoc probing once $\hat{\beta}_3$ is confirmed to be significant. Please have a look at [this tutorial][1], where post-hoc analysis is explained for continuous moderators (pp. 45-53). I want to do it for a categorical one. You mention that "it's often convenient to plot lines at the mean of Z, 1 SD above the mean and 1 SD below". Yes, it is. But can't I do it for categorical moderators? That is my question. [1]: http://goo.gl/JN26d – Rahul Aug 12 '12 at 23:00
  • @Rahul, only the last sentence is about how you would adapt the procedure for continuous moderators; the bulk of the last paragraph explains how to do this for categorical moderators. – gung - Reinstate Monica Aug 13 '12 at 01:30