I've run three pairwise two-way ANOVAs on my variables, with the results below:

Analysis of Variance Table

Response: Y
                                    Df  Sum Sq Mean Sq F value    Pr(>F)    
Dataset$A                           1 1140.57 1140.57 156.769 6.395e-11 ***
Dataset$B                           1  168.18  168.18  23.116 0.0001070 ***
Dataset$A:Dataset$B                 1  150.12  150.12  20.633 0.0001982 ***
Residuals                           20  145.51    7.28                      
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Analysis of Variance Table

Response: Y
                                Df  Sum Sq Mean Sq F value   Pr(>F)    
Dataset$A                       1 1140.57 1140.57 64.7121 1.07e-07 ***
Dataset$C                       1   75.62   75.62  4.2904  0.05148 .  
Dataset$A:Dataset$C             1   35.69   35.69  2.0247  0.17018    
Residuals                       20  352.51   17.63                     
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Analysis of Variance Table

Response: Y
                                Df  Sum Sq Mean Sq F value Pr(>F)
Dataset$B                       1  168.18 168.184  2.4736 0.1315
Dataset$C                       1   75.62  75.620  1.1122 0.3042
Dataset$B:Dataset$C             1    0.72   0.723  0.0106 0.9189
Residuals                       20 1359.86  67.993

From these tables, which pair of variables would be said to interact more strongly? My interpretation is that the interaction between A and B is strongest, because the Pr(>F) for Dataset$A:Dataset$B is the lowest.

Would this be a correct interpretation?

jypark1

2 Answers


Your interpretation is correct. Pr(>F) tells you whether the interaction effect is statistically significant: if the interaction is significant at the 95% confidence level, its p-value will necessarily be below 0.05.
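As a sketch of how to read that value programmatically (the data here are simulated, standing in for Dataset$A, Dataset$B and Y), the interaction's Pr(>F) can be pulled straight out of the anova table and compared against 0.05:

```r
# Hypothetical toy data with a deliberately strong A:B interaction
set.seed(42)
d <- data.frame(A = rep(0:1, each = 12), B = rep(0:1, times = 12))
d$Y <- 5 + 2 * d$A + 1 * d$B + 3 * d$A * d$B + rnorm(24, sd = 0.5)

fit <- lm(Y ~ A * B, data = d)
tab <- anova(fit)

# Pr(>F) for the interaction row
p_int <- tab["A:B", "Pr(>F)"]
p_int < 0.05  # significant at the 95% level for this simulated interaction
```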

Mohanasundaram

First of all, you should fit a single model with all the terms and all the interactions; in your case I guess it will be:

# add Y to the data frame if it is not already a column
dataset$Y <- Y
# fit all main effects plus all two-way interactions
anova(lm(Y ~ .*., data = dataset[, c("Y", "A", "B", "C")]))

The reason for doing this is to properly estimate the residual variance and to properly account for the effects of all the variables.
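To see why this matters, here is a minimal sketch on simulated data (variable names and effect sizes are made up): when C is left out of a pairwise model, C's contribution to Y ends up in the residual, inflating the residual mean square that every F test is built on.

```r
# Hypothetical data with three predictors; C has a real effect on Y
set.seed(1)
n <- 24
d <- data.frame(A = rnorm(n), B = rnorm(n), C = rnorm(n))
d$Y <- 2 * d$A + 1.5 * d$B + d$C + d$A * d$B + rnorm(n)

pairwise <- anova(lm(Y ~ A * B, data = d))  # ignores C entirely
full     <- anova(lm(Y ~ .*.,  data = d))   # all main effects + two-way interactions

# C's unmodelled effect inflates the pairwise model's residual mean square
pairwise["Residuals", "Mean Sq"]
full["Residuals", "Mean Sq"]
```

The same mechanism explains why the residual mean squares in the three tables above differ so much (7.28 vs 17.63 vs 67.99): each pairwise model dumps the omitted effects into its residual.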

Although the term is an interaction term, having a smaller p-value or a larger coefficient doesn't make it more "interactive". Take Dataset$A:Dataset$B as an example: it means that by including an additional predictor, A multiplied by B, the model can explain more of the variance of Y.

I cannot tell whether your variables are continuous or binary, but let's use a simple example. If A is binary (0/1) and B is continuous, a significant interaction term means there is evidence that the slope of B vs Y changes in the presence / absence of A. If both are continuous, it can mean that the slope of B vs Y changes as you go from high to low values of A. Again, all these interpretations will only make sense when you know what A, B and C are, and you need to look at the coefficients of the interaction term. You can also check out the question and answers in this post
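The binary-A case can be sketched with simulated data (all names and numbers here are hypothetical): the slope of B on Y is 2 when A == 0 and 5 when A == 1, so the A:B coefficient should recover the slope difference of 3.

```r
# Hypothetical illustration: A binary, B continuous, slopes differ by group
set.seed(7)
n <- 200
A <- rep(0:1, each = n / 2)
B <- runif(n)
Y <- 1 + 2 * B + 3 * A * B + rnorm(n, sd = 0.3)

fit <- lm(Y ~ A * B)
coef(fit)["A:B"]  # estimates the slope difference, close to the true value 3
```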

So I suggest refitting the model and checking whether the interaction still holds. You can also visualize the relationships between your variables with packages like sjPlot.

StupidWolf