
I see examples here and here where OLS is performed first and then the resulting model is passed to anova_lm.

With the crop_yield.csv data, the following is the OLS output:

Analyzing: Yield ~ Fert*Water
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                  Yield   R-squared:                       0.435
Model:                            OLS   Adj. R-squared:                  0.330
Method:                 Least Squares   F-statistic:                     4.112
Date:                Fri, 17 Jul 2020   Prob (F-statistic):             0.0243
Time:                        23:06:07   Log-Likelihood:                -50.996
No. Observations:                  20   AIC:                             110.0
Df Residuals:                      16   BIC:                             114.0
Df Model:                           3                                         
Covariance Type:            nonrobust                                         
==========================================================================================
                             coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------------------
Intercept                 31.8000      1.549     20.527      0.000      28.516      35.084
Fert[T.B]                 -1.9600      2.191     -0.895      0.384      -6.604       2.684
Water[T.Low]              -1.8000      2.191     -0.822      0.423      -6.444       2.844
Fert[T.B]:Water[T.Low]    -3.5200      3.098     -1.136      0.273     -10.088       3.048
==============================================================================
Omnibus:                        3.427   Durbin-Watson:                   2.963
Prob(Omnibus):                  0.180   Jarque-Bera (JB):                1.319
Skew:                          -0.082   Prob(JB):                        0.517
Kurtosis:                       1.752   Cond. No.                         6.85
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
Overall model F( 3, 16) =  4.112, p =  0.0243

Subsequent use of anova_lm gives the following output:

res = sm.stats.anova_lm(model, typ=2)
print(res)

            sum_sq     df     F  PR(>F)
Fert        69.192  1.000 5.766   0.029
Water       63.368  1.000 5.281   0.035
Fert:Water  15.488  1.000 1.291   0.273
Residual   192.000 16.000   nan     nan
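
For reference, the full workflow is sketched below. This assumes crop_yield.csv has columns Yield, Fert, and Water, matching the formula above:

    import pandas as pd
    import statsmodels.api as sm
    import statsmodels.formula.api as smf

    # Fit the OLS model from the formula, then run a Type II ANOVA on the same fit
    df = pd.read_csv("crop_yield.csv")            # columns: Yield, Fert, Water
    model = smf.ols("Yield ~ Fert*Water", data=df).fit()
    print(model.summary())                        # the OLS table above
    res = sm.stats.anova_lm(model, typ=2)
    print(res)                                    # the ANOVA table above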

In the OLS result, neither Fert nor Water shows a significant p-value (although the overall model has a p-value of 0.02), while in the two-way ANOVA both are significant. What is the interpretation of each of these results?

rnso

1 Answer


In addition to issues with the different types of ANOVA, you have to remember that with interactions, p-values from ANOVA and p-values for regression coefficients can mean different things because they represent different null hypotheses.

In ANOVA, the null hypothesis is that the predictor is not associated with outcome, tested by an F-test of its contribution to the sum of squares. It tests deviations of the means of the cells in the table from the grand mean. (Just how a predictor contribution is estimated can depend on the type of ANOVA if the design is unbalanced.)
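
As a concrete check, each F statistic in your anova_lm table is just the term's mean square divided by the residual mean square; a quick sketch with the numbers printed above:

    # Reconstruct the F statistics from the sums of squares in the ANOVA table
    ss = {"Fert": 69.192, "Water": 63.368, "Fert:Water": 15.488}
    mse = 192.000 / 16                            # residual mean square = 12.0
    for term, s in ss.items():
        print(term, round(s / mse, 3))            # each term has 1 df, so MS = SS
    # Fert 5.766, Water 5.281, Fert:Water 1.291 -- matching the F column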

In a linear regression model with interactions and treatment coding of predictors, the null hypothesis for the coefficient of a single predictor is that it equals zero when the other predictors are also at 0 (continuous predictors) or at their reference levels (categorical predictors). With continuous predictors this means that centering can change single-predictor coefficient p-values even though the interaction stays the same (as in your 2 analyses). In your example, although the interaction is not "statistically significant", it is evidently large enough to make the individual coefficients appear insignificant.
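
Here is a hedged simulated illustration (the data and variable names are invented for the example): centering a continuous predictor changes its own coefficient and p-value, while the interaction term is untouched.

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(0)
    n = 200
    x1 = rng.uniform(10, 20, n)                   # deliberately far from 0
    x2 = rng.uniform(10, 20, n)
    y = 2 * x1 * x2 + rng.normal(0, 5, n)         # a pure interaction effect
    df = pd.DataFrame({"y": y, "x1": x1, "x2": x2})
    df["x1c"] = df.x1 - df.x1.mean()              # centered copies
    df["x2c"] = df.x2 - df.x2.mean()

    raw = smf.ols("y ~ x1*x2", data=df).fit()
    cen = smf.ols("y ~ x1c*x2c", data=df).fit()
    print(raw.pvalues[["x1", "x1:x2"]])           # x1 tested at x2 = 0 (true effect there is 0)
    print(cen.pvalues[["x1c", "x1c:x2c"]])        # x1c tested at the mean of x2

The interaction coefficient and its p-value are identical in the two fits; only the single-predictor tests move.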

To see what's going on with an interaction, consider the following as the result of a simple experiment with a 2 x 2 design similar to your example. Say that you choose to code the independent variables as X1 and X2 with values of 0 or 1, and compare results against what would happen if you chose to code them instead as W1 and W2 with values -1/2 and +1/2. The difference between the 2 levels of each independent variable is still 1 whether you use the X or the W coding. The table shows average values observed for the outcome Y for each combination of the independent variables, and we assume equal numbers of observations in each of the 4 cells.

Mean values of Y in a 2 x 2 design

          | X1 = 0      X1 = 1
          | W1 = -1/2   W1 = +1/2
----------+-----------------------
X2 = 0    |
W2 = -1/2 |    0           0
----------+-----------------------
X2 = 1    |
W2 = +1/2 |    0           1

If you analyzed these results with a linear regression based on X1 and X2, you would get

Y = 0 + 0 X1 + 0 X2 + 1 X1X2.

That is, the intercept is 0, the individual coefficients for X1 and X2 are both zero, and the coefficient for the X1X2 interaction is 1.

Now analyze the same outcomes with linear regression based on W1 and W2 as the independent variables. You get:

Y = 1/4 + 1/2 W1 + 1/2 W2 + 1 W1W2

with a non-zero intercept, substantial coefficients for W1 and W2 individually, and still a coefficient of 1 for the interaction term. Classic ANOVA is done around the grand mean of the observations (1/4 in this example, however the independent variables are coded) and, in a balanced design, leads to a model equivalent to the linear regression based on W1 and W2 as the predictors.
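
A quick sketch to verify both codings numerically, using balanced made-up data that reproduce the table above (the cell counts are arbitrary):

    import pandas as pd
    import statsmodels.formula.api as smf

    # Cell means from the table: only the (X1=1, X2=1) cell has Y = 1
    cells = [(0, 0, 0.0), (1, 0, 0.0), (0, 1, 0.0), (1, 1, 1.0)]
    rows = [(x1, x2, y) for x1, x2, y in cells for _ in range(5)]
    df = pd.DataFrame(rows, columns=["X1", "X2", "Y"])
    df["W1"] = df.X1 - 0.5                        # recode 0/1 as -1/2, +1/2
    df["W2"] = df.X2 - 0.5

    print(smf.ols("Y ~ X1*X2", data=df).fit().params)   # 0, 0, 0, 1
    print(smf.ols("Y ~ W1*W2", data=df).fit().params)   # 0.25, 0.5, 0.5, 1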

So a coefficient of 0 for X1 or X2 as an individual predictor in the first regression doesn't mean that either independent variable is unassociated with outcome. Just centering their values to produce W1 and W2 leads to non-zero individual coefficients. What this means is that with an interaction you can't just look at predictor coefficients in isolation; you have to consider them together with the interactions that involve them.

ANOVA is just a special case of a linear model. It's not inherently "better" than linear regression in this case; it just presents the results in a different way that avoids some complications of interpreting intercepts and single-predictor coefficients when there are interactions.

If you want to evaluate the importance of a predictor along with its interactions in a linear model where ANOVA isn't appropriate, you can do a Wald test incorporating all the coefficients involving the predictor and its interactions, using the coefficient covariance matrix as the basis for the error estimate. This is the approach used in the rms package in R.
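
In statsmodels you can do this directly on the fitted result; for example (a sketch using the question's model, where combine_terms pools every coefficient whose name contains the given string into one joint test):

    # Joint Wald tests: each predictor together with the interactions involving it,
    # analogous to what anova() in the R rms package reports
    print(model.wald_test_terms(combine_terms=["Fert", "Water"]))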

EdM
  • The two approaches (ANOVA and regression) should also be related: if the coefficient is 0, that would indicate no association. Wouldn't that make sense? – rnso Jul 18 '20 at 01:22
  • I am primarily interested in whether the predictors are associated with the outcome. What you are saying is that ANOVA is better for this purpose. However, in the biomedical literature multiple regression is widely used to test this. See the comments following the answer to this question https://stats.stackexchange.com/questions/477574/regression-for-prediction-versus-understanding-independent-associations/477584?noredirect=1#comment881764_477584 – rnso Jul 18 '20 at 01:42
  • @rnso added an example that I hope makes the situation with interactions clearer. With an interaction you can't interpret a single-predictor coefficient in isolation; its apparent "significance" can just depend on how it's coded. You need to consider all coefficients involving the predictor, including interactions. – EdM Jul 18 '20 at 19:38
  • That clarifies the differences very well. Since interactions are so important, the obvious question is: are ANOVA and regression equivalent if there is no interaction? For `Yield ~ Fert + Water`? – rnso Jul 19 '20 at 02:05
  • @rnso this is where the different types of ANOVA noted at the beginning of my answer come into play. So-called Type I ANOVA goes stepwise, giving the first predictor in the list all the "credit" it can get for its association with outcome. In some types of studies, like with unbalanced designs, that leaves subsequent predictors less possible "credit" for assessing their significance. See [this question](https://stats.stackexchange.com/q/13241/28500). That's not how the predictors would be evaluated in a linear regression. Wald tests provide a safe, general way around such problems. – EdM Jul 19 '20 at 02:27
  • Thanks for your explanations. I can't extend the comments here; please see this related question: https://stats.stackexchange.com/questions/477841/why-should-one-do-wald-test-after-linear-regression – rnso Jul 19 '20 at 05:01