2

There is a significant 3-way interaction in a data-set I'm working with. The interaction involves 2 categorical (with two levels each) and a quantitative variable.

I think I understand how to interpret this interaction, but I would like to be able to plot in such a way that all the aspects of the interaction are shown in a graph. I was looking for other posts asking similar questions, and I found this one, asking a similar question but with two quantitative and a categorical variable.

Would there be a way to plot my results in a similar way to the one described in the link I included?


Below is a variant of the data I describe in my post, but with fake data. The lme models I'm running look exactly the same as I describe this:

rm()
set.seed(1)
obs <- data.frame(v1=sample(c("A","B"), size=100, replace=TRUE), 
                  v2=sample(c("Y","Z"), size=100, replace=TRUE), 
                  v3=sample(1:10, size=100, replace=TRUE), 
                  dv=rnorm(100)) 
model <- lm(dv~v1*v2*v3, data=obs) 
anova(update(model,.~.-v1:v2:v3), model)
sayah
  • 21
  • 3

1 Answers1

1

The thing to understand about interactions is there is a different relationship between $X_1$ and $Y$ at each point in $X_2$. This leads to a 'twisted' surface (like what you see in the plot in the linked answer), often in a high-dimensional space. In your case, two of the three interacting variables are binary. That simplifies things quite a bit. Because v3 and dv are continuous, they can be conceptualized / visualized as lines. The interaction is then represented by four different ($2\times 2$) lines, which will all fit nicely on a single plot.

Here I have augmented your data to create an interaction, and I make an example plot:

rm();  set.seed(1)
obs <- data.frame(v1=sample(c("A","B"), size=100, replace=TRUE), 
                  v2=sample(c("Y","Z"), size=100, replace=TRUE), 
                  v3=sample(1:10, size=100, replace=TRUE), 
                  dv=rnorm(100)) 
# here I augment the dv so there is an interaction
B      = ifelse(obs$v1=="B", 1, 0);  Z = ifelse(obs$v2=="Z", 1, 0)
obs$dv = with(obs, .6*B + .7*Z + .8*v3 + .5*B*Z + .4*B*v3 + .3*Z*v3 - .9*B*Z*v3 + dv)
model <- lm(dv~v1*v2*v3, data=obs) 
anova(update(model, .~.-v1:v2:v3), model)
#   Res.Df     RSS Df Sum of Sq      F    Pr(>F)    
# 1     93 130.849                                  
# 2     92  98.816  1    32.033 29.824 3.997e-07 ***

  ## function just to save typing:
myplot = function(main){
  x = 1:10
  plot(x,  predict(model, newdata=data.frame(v1="A", v2="Y", v3=x)), type="l", lwd=2, 
       xlab="v3", ylab="dv", ylim=c(0,13), main=main, col="#a6cee3")
  lines(x, predict(model, data.frame(v1="A", v2="Z", v3=x)), lty=2,lwd=2,col="#1f78b4")
  lines(x, predict(model, data.frame(v1="B", v2="Y", v3=x)), lty=3,lwd=2,col="#33a02c")
  lines(x, predict(model, data.frame(v1="B", v2="Z", v3=x)), lty=4,lwd=2,col="#b2df8a")
  legend("topleft", legend=c("A, Y", "A, Z", "B, Y", "B, Z"), lty=1:4, lwd=2, pch=1:4,
         col=c("#a6cee3", "#1f78b4", "#33a02c", "#b2df8a"))
}
windows()
  layout(matrix(1:4, nrow=2, byrow=T))
  myplot(main="A, Y")
  with(subset(obs, v1=="A"&v2=="Y"), 
       points(v3-.2, dv, col=rgb(166,206,227, 200,NULL,255), pch=1))
  myplot(main="A, Z")
  with(subset(obs, v1=="A"&v2=="Z"), 
       points(v3-.1, dv, col=rgb(31,120,180,  200,NULL,255), pch=2))
  myplot(main="B, Y")
  with(subset(obs, v1=="B"&v2=="Y"), 
       points(v3+.1, dv, col=rgb(51,160,44,   200,NULL,255), pch=3))
  myplot(main="B, Z")
  with(subset(obs, v1=="B"&v2=="Z"), 
       points(v3+.2, dv, col=rgb(178,223,138, 200,NULL,255), pch=4))

The main idea is just to plot the four lines. It is sometimes helpful to see them superimposed on the data. Your mileage may vary here. I thought the plot was too busy to be useful with all the data. So instead, I opted for four parallel plots, all with all the fitted lines drawn, but each with only one subset of the data.

enter image description here

gung - Reinstate Monica
  • 132,789
  • 81
  • 357
  • 650