What is the most reliable way to visualize the mixed model?

Question

This may be a beginner’s question, but any help is appreciated! First, how do I interpret an interaction between continuous variables, such as WORD_LENGTH_SCALE:Freq_SCALE and WORD_LENGTH_SCALE:Freq_SCALE:Freq_2nd?

Here is my code. (Freq_2nd is not applicable to one word-length condition: when the word length = 1, Freq_2nd is NA. Thus, I wrote it as (WORD_LENGTH_SCALE:Freq_2nd).)

fm = lmer(response ~ WORD_LENGTH_SCALE * Freq_SCALE *
    (WORD_LENGTH_SCALE:Freq_2nd) + (1 | subjects) + (1 | trail_no),
    data = data_01, REML = TRUE)

and the results:

 Fixed effects:
                                       Estimate Std. Error t value
(Intercept)                            4.205600   0.015334 274.272
WORD_LENGTH_SCALE                     -0.025104   0.015568  -1.613
Freq_SCALE                            -0.004509   0.001953  -2.309
WORD_LENGTH_SCALE:Freq_SCALE          -0.019313   0.008061  -2.396
WORD_LENGTH_SCALE:Freq_2nd             0.004601   0.002513   1.831
WORD_LENGTH_SCALE:Freq_SCALE:Freq_2nd  0.003309   0.001370   2.416

Correlation of Fixed Effects:
                           (Intr) WORD_LENGTH_SCALE F_SCAL WORD_LENGTH_SCALE:Fr_SCALE WORD_LENGTH_SCALE:F_2
WORD_LENGTH_SCALE          -0.029                                                                          
Freq_SCALE                  0.145  0.093                                                                   
WORD_LENGTH_SCALE:Fr_SCALE -0.015  0.849             0.051                                                 
WORD_LENGTH_SCALE:F_2      -0.024 -0.958            -0.295 -0.825                                          
WORD_LENGTH_SCALE:F_SCALE: -0.033 -0.785            -0.284 -0.957                      0.839

Second, I used ggeffects, sjPlot and the prediction function to plot graphs, but the plots seem to differ from the model. In the model, the coefficients for Freq_SCALE and WORD_LENGTH_SCALE are both negative, while in the plot the slope for Freq_SCALE is negative but the slope for WORD_LENGTH_SCALE is positive. Why is this the case?

Third, what is the most reliable way to visualize the mixed model? sjPlot and the prediction function give me different plots. So I am not sure what to do ...

sjPlot: [plot image not included]

prediction: [plot image not included]

Displaying model results with Interactions is tricky, as the slope of the relationship for one predictor depends on the actual value of the interacting predictor. The negative slope displayed for `WORD_LENGTH_SCALE` in the table is only correct at a `Freq_SCALE` value of 0. Please edit your question to show the values you assumed for `Freq_SCALE` in making your plots. It's possible that different software packages use different default values for interacting predictors. — EdM, Dec 24 '21 at 20:57
You need to show results for _all_ of the coefficients, as you have a 3-way interaction among your 3 fixed predictors. Otherwise one can't reproduce what the `response` values should be in your plots. It looks like your last 2 plots might have been based on different values for `score_SCALE`. It would also be better if, instead of showing an image, you could show the coefficients with the "Code Sample" format in editing the question, using the "{}" icon in the format bar, because then it would be simpler to copy values. — EdM, Dec 25 '21 at 20:32
Hi EdM, sorry for the unclear post; I'm fairly new to this. Am I right this time? — Ann Li, Dec 26 '21 at 11:21

score 3 · Accepted Answer · answered Dec 26 '21 at 17:24

You will have to apply your knowledge of the subject matter to decide the most reliable way to proceed. First some general considerations, then some warnings about your application.

In general, with interactions among three predictors there is no single association of any predictor with outcome. The association of one of them with outcome depends on the values of the other two.

For example, the single-predictor coefficient for WORD_LENGTH_SCALE of -0.025104 only holds when the values of the other 2 predictors are 0. Some particular values of Freq_SCALE and Freq_2nd were necessarily assumed to produce the first plot of response versus WORD_LENGTH_SCALE. Presumably, the assumed values of those other predictors were such that the interactions of WORD_LENGTH_SCALE with the other predictors led to a net positive slope.
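To make that dependence explicit, here is the slope implied by the coefficients reported in the question, writing ws, fs and f2 for WORD_LENGTH_SCALE, Freq_SCALE and Freq_2nd (shorthand introduced here for illustration). It is only a rearrangement of the fixed-effect point estimates, ignoring their uncertainty:

```latex
$$
\frac{\partial\,\text{response}}{\partial\,\text{ws}}
  = -0.025104 \;-\; 0.019313\,\text{fs} \;+\; 0.004601\,\text{f2} \;+\; 0.003309\,\text{fs}\cdot\text{f2}
$$
```

With fs = 0 and f2 = 0 this reduces to the tabled coefficient, -0.025104. A sufficiently large f2 at fs = 0 makes the slope positive, which is one way a plot can show a positive trend even though the tabled coefficient is negative.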

I don't use sjPlot and the details of your prediction method aren't completely clear, but what you have to do is independent of those details. To understand the plots that you show, you need to read the help pages to find out the particular assumptions those methods are making about all of the predictors. Your plots show, at most, the predictor on the horizontal axis and one other predictor. You need to know the values of all 3 predictors to see whether the predictions make sense.

For comparison, use your coefficients to write the point estimates of response from your model directly in terms of WORD_LENGTH_SCALE (ws), Freq_SCALE (fs), and Freq_2nd (f2):

response <- function(ws,fs,f2) {4.205600 - (0.025104 * ws) - (0.004509 * fs) - (0.019313 * ws * fs) +  (0.004601 * ws * f2) +  (0.003309 * ws * fs * f2)}

and plug in appropriate numbers.
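As a rough illustration of plugging in numbers, the sketch below repeats the function and compares the change in predicted response per unit of WORD_LENGTH_SCALE at two settings of the other predictors. The helper `ws_slope` and the value `f2 = 10` are hypothetical choices made only for this example; use values that actually occur in your data.

```r
# Point estimates built from the fixed-effect coefficients reported in the question.
response <- function(ws, fs, f2) {
  4.205600 - 0.025104 * ws - 0.004509 * fs - 0.019313 * ws * fs +
    0.004601 * ws * f2 + 0.003309 * ws * fs * f2
}

# Hypothetical helper: change in predicted response for a one-unit increase
# in WORD_LENGTH_SCALE, holding Freq_SCALE (fs) and Freq_2nd (f2) fixed.
ws_slope <- function(fs, f2) response(1, fs, f2) - response(0, fs, f2)

# At fs = 0 and f2 = 0 the slope equals the tabled coefficient.
slope_at_zero <- ws_slope(fs = 0, f2 = 0)

# f2 = 10 is an assumed, illustrative value; the slope changes sign here.
slope_at_f2_10 <- ws_slope(fs = 0, f2 = 10)

print(c(at_zero = slope_at_zero, at_f2_10 = slope_at_f2_10))
```

The first slope is negative and the second is positive, so whichever values a plotting package silently assumes for the other predictors determine the direction of the line it draws.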

The most useful way to present results of models with interactions is to illustrate with intelligently chosen combinations of predictor values that are expected in practice. Don't depend on hidden assumptions that software packages make about predictor values that you don't directly specify. Use your knowledge of the subject matter to make sets of explicit choices about all 3 predictors that will best illustrate your results to your audience. Software should be able to make predictions based on your own choices of values.
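One possible way to make those choices explicit is to build the grid of predictor combinations yourself. This is a minimal sketch in base R using the point-estimate function from the coefficients in the question; the grid values below are placeholders, not values taken from your data.

```r
# Point estimates built from the fixed-effect coefficients reported in the question.
response <- function(ws, fs, f2) {
  4.205600 - 0.025104 * ws - 0.004509 * fs - 0.019313 * ws * fs +
    0.004601 * ws * f2 + 0.003309 * ws * fs * f2
}

# Explicit, user-chosen combinations of all three predictors.
# These particular values are hypothetical; replace them with values
# that are meaningful for your subject matter.
grid <- expand.grid(
  ws = c(-1, 0, 1),   # WORD_LENGTH_SCALE
  fs = c(-1, 0, 1),   # Freq_SCALE
  f2 = c(0, 5, 10)    # Freq_2nd
)

# Predicted response for every combination in the grid.
grid$predicted <- response(grid$ws, grid$fs, grid$f2)

print(grid)
```

With the fitted model at hand, a similar grid whose columns carry the model's own variable names could be passed to `predict(fm, newdata = ..., re.form = NA)` to get population-level predictions, and the resulting table plotted with whatever tool you prefer. Either way, every predictor value behind each plotted point is one you chose.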

Now for the warnings. First, if WORD_LENGTH_SCALE (or any other predictor) is derived from only a small set of integer values, you probably don't want to show plots that imply a continuous relationship with outcome. (You might even consider whether modeling as a single continuous predictor makes sense.) Separate plots and standard errors for individual values would be preferable.

Second, you say:

Freq_2nd is not applicable to a word length condition. That is, when the word length = 1, Freq_2nd is NA. Thus, I wrote it as (WORD_LENGTH_SCALE:Freq_2nd).

I'm not sure that your model handles this situation properly. In general, software will remove any data rows that have NA values in any variable included in the model before fitting. You might need to tweak your model and your data coding accordingly.
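A quick way to check how many rows survive is to build the model frame yourself. The sketch below uses a small made-up data set (the `toy` data frame and its values are hypothetical) and base R's `model.frame`, which applies the same kind of NA handling that modeling functions typically use by default.

```r
# Hypothetical toy data: Freq_2nd is NA whenever word_length is 1.
toy <- data.frame(
  response    = c(4.1, 4.3, 4.2, 4.0, 4.4, 4.2),
  word_length = c(1, 1, 2, 2, 3, 3),
  Freq_2nd    = c(NA, NA, 2.5, 3.1, 1.8, 2.9)
)

# Build the model frame with NA rows omitted, as happens by default in fitting.
mf <- model.frame(response ~ word_length:Freq_2nd, data = toy,
                  na.action = na.omit)

n_total <- nrow(toy)               # rows supplied
n_used  <- nrow(mf)                # rows that would enter the fit
dropped <- attr(mf, "na.action")   # indices of the rows that were removed

print(c(total = n_total, used = n_used, dropped = length(dropped)))
```

Here every row with word length 1 disappears, even though the formula only uses `Freq_2nd` inside an interaction. Comparing `nrow()` of your data with `nobs()` of your fitted model is a similar check on the real analysis.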

Thank you very much!! It is very helpful! Yes, you are right, I just found out that the model removed all NAs. Do you think it will work if I transform NA to a number, like 0 which did not occur in some other condition? — Ann Li, Dec 29 '21 at 08:53
@AnnLi with care that should work; see [this answer](https://stats.stackexchange.com/a/6565/28500). You might need an extra indicator variable marking cases for which `Freq_2nd` values make sense. Coding `word_length` as a factor instead of continuous with `word_length=1` as the reference, or subtracting 1 from all `word_length` values so that a `word_length` of 1 leads to a net predictor value of 0 might work, as those put `word_length=1` directly into the intercept as needed for that trick. Might be trickier with interactions; think through the interpretation of all your coefficients first. — EdM, Dec 29 '21 at 15:17
Thank you for your great help! I have a tricky question. What if I had another categorical factor? For example, in group A, when word_length = 1, Freq_2nd is NA. However, in group B, Freq_2nd is NA regardless of the length of the word. In this case, can I still subtract 1 from all word_length values? Or should I subtract 1 only from word_length in group A? Because my model is complex and large, and for research purposes, it is not a good idea to set word_length as a factor. — Ann Li, Dec 30 '21 at 12:59
@AnnLi with a more complex situation you need to try the "extra indicator variable" method, with an indicator that is 0 when `Freq_2nd` values aren't applicable and 1 when they are. Then keep `word_length` in its natural scale and set `Freq_2nd` to 0 for cases where `Freq_2nd` isn't applicable. That's an extension of the method in the [answer linked above](https://stats.stackexchange.com/a/6565/28500). That seems like it should work, but think through the interpretation of all the coefficients and interaction terms in your model to check, first. — EdM, Dec 31 '21 at 15:40
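For what it's worth, the coding step of the "extra indicator variable" idea might look like the sketch below. The toy data and the column names `has_2nd` and `Freq_2nd_0` are hypothetical, and it only shows how the design matrix is built; whether the resulting coefficients mean what you want still has to be thought through for your full model.

```r
# Hypothetical toy data: Freq_2nd is not applicable (NA) when word_length is 1.
toy <- data.frame(
  word_length = c(1, 1, 2, 2, 3, 3),
  Freq_2nd    = c(NA, NA, 2.5, 3.1, 1.8, 2.9)
)

# Indicator: 1 when Freq_2nd is applicable, 0 when it is not.
toy$has_2nd <- as.integer(!is.na(toy$Freq_2nd))

# Copy of Freq_2nd with the not-applicable cases set to 0.
toy$Freq_2nd_0 <- ifelse(is.na(toy$Freq_2nd), 0, toy$Freq_2nd)

# Design matrix for one possible fixed-effects specification (an assumption,
# not the specification from the question). No rows are lost to NA.
X <- model.matrix(~ word_length + has_2nd + has_2nd:Freq_2nd_0, data = toy)

print(X)
```

All six rows are kept, and the `has_2nd:Freq_2nd_0` column is 0 wherever `Freq_2nd` is not applicable, so that term contributes nothing to predictions for those cases.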
