Level plot for continuous x continuous interaction with continuous response

Question

Recently, I came across a publication using a level/tile/contour plot to illustrate the relationship between two continuous variables and a continuous response (with the input variables on the x and y axes and the response plotted as a color). This struck me as a very intuitive way to plot this type of interaction, but I have since had difficulty finding similar uses of this type of plot in this context.

Here is a contrived example of such a plot, using mtcars and ggplot2 in R.

lm.mod <- lm(mpg ~ wt*hp, data = mtcars)
summary(lm.mod)

...
Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 49.80842    3.60516  13.816 5.01e-14 ***
wt          -8.21662    1.26971  -6.471 5.20e-07 ***
hp          -0.12010    0.02470  -4.863 4.04e-05 ***
wt:hp        0.02785    0.00742   3.753 0.000811 ***
...

library(ggplot2)

# Build a 100 x 100 grid spanning the observed ranges of hp and wt
prepplot <- expand.grid(
  hp = seq(52, 335, length.out = 100),
  wt = seq(1.513, 5.424, length.out = 100)
)
# Predicted mpg at each grid point, taken from the fitted model
# rather than from hand-copied (rounded) coefficients
prepplot$est.mpg <- predict(lm.mod, newdata = prepplot)

ggplot(prepplot, aes(wt, hp, fill = est.mpg)) + 
  geom_tile() +
  xlab("Weight (1000 lbs.)") + ylab("Horsepower") +
  scale_fill_gradientn(colours = c("#b2182b","#d6604d","#f4a582","#fddbc7","#f7f7f7","#d1e5f0","#92c5de","#4393c3","#2166ac")) +
  scale_x_continuous(expand = c(0,0)) +
  scale_y_continuous(expand = c(0,0))

Is it fair to interpret this plot in this way: "The color of a given coordinate represents the predicted MPG for that weight and horsepower"? If so, how might someone create this plot for two interacted terms in a regression with even more predictors? Does this plot commit a statistical fault I am not considering?

score 1 · Accepted Answer · answered Oct 14 '17 at 06:26

Yes, your interpretation of the colors makes perfect sense.

(Incidentally, this kind of plot is called a "heat map" - a level plot usually uses contour lines to indicate discrete levels of data, and colors are secondary.)

If you have more predictors, say three, a sensible approach is to plot heatmaps like yours for two of the three predictors at discrete levels of the third predictor, for instance at the three quartiles of its observed values.
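
A minimal sketch of that idea, assuming a hypothetical third predictor is added to the question's model. The choice of qsec, the model formula, and the names mod3, q, and grid3 are illustrative assumptions, not part of the original post:

```r
# Hypothetical model: the question's wt*hp interaction plus a third predictor
# (qsec is an arbitrary choice made only for illustration).
mod3 <- lm(mpg ~ wt * hp + qsec, data = mtcars)

# The three quartiles of the observed values of the third predictor
q <- unname(quantile(mtcars$qsec, probs = c(0.25, 0.5, 0.75)))

# Grid over the observed ranges of wt and hp, repeated at each quartile
grid3 <- expand.grid(
  wt   = seq(min(mtcars$wt), max(mtcars$wt), length.out = 50),
  hp   = seq(min(mtcars$hp), max(mtcars$hp), length.out = 50),
  qsec = q
)
grid3$est.mpg <- predict(mod3, newdata = grid3)

# One heatmap panel per quartile of qsec (plotting needs ggplot2)
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(grid3, aes(wt, hp, fill = est.mpg)) +
    geom_tile() +
    facet_wrap(~ qsec, labeller = label_both) +
    xlab("Weight (1000 lbs.)") + ylab("Horsepower")
  print(p)
}
```

The fill scale is shared across panels by default, so colors can be compared between the three quartiles.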

With four or more predictors, I usually despair of visualizing.

This earlier question may be helpful: "How to investigate a 3-way interaction?"

Stephan Kolassa

Hi Stephan, thanks for your response and your clarification on the terminology. The 3d plots you illustrate in the earlier question are very neat. I was hoping to use this type of figure in a presentation to help make a complicated interaction a bit clearer; however, my regression has more than 3 predictors. Would it be erroneous to create such a plot where color represents the "predicted numeric change in the response variable" (with other variables at baseline), and color is on a +/- scale? – Tobleroni Oct 16 '17 at 15:47
I think that would be fine. It sounds complicated, though, so I'd recommend that you discuss your plots with someone before presenting them, to see whether they actually convey the information you'd like to convey. – Stephan Kolassa Oct 16 '17 at 18:02
