27

I was always under the impression that regression is just a more general form of ANOVA and that the results would be identical. Recently, however, I have run both a regression and an ANOVA on the same data and the results differ significantly. That is, in the regression model both main effects and the interaction are significant, while in the ANOVA one main effect is not significant. I expect this has something to do with the interaction, but it's not clear to me what is different about these two ways of modeling the same question. If it's important, one predictor is categorical and the other is continuous, as indicated in the simulation below.

Here is an example of what my data look like and what analyses I'm running. The simulation won't reproduce the same p-values or pattern of significance (my actual results are described above):

group     <- c(1, 1, 1, 0, 0, 0)   # categorical predictor (coded numerically here)
moderator <- c(1, 2, 3, 4, 5, 6)   # continuous predictor
score     <- c(6, 3, 8, 5, 7, 4)   # outcome

summary(lm(score ~ group * moderator))    # regression: coefficient t-tests
summary(aov(score ~ group * moderator))   # ANOVA table
amoeba
Rebecca
  • summary(lm()) is giving you the coefficients for the contrasts you have specified (treatment contrasts in the absence of any specification here), while summary(aov()) is giving you the ANOVA table. If you want the ANOVA table for the lm model, you need anova(lm()) – Matt Albrecht Dec 20 '11 at 06:17
  • `group` is a numerical vector, is this on purpose? Normally, grouping factors should have class `factor`, so that the transformation to contrasts can be handled automatically by functions like `lm()`. This will become apparent once you have more than two groups, or use a coding other than 0/1 for your `group` variable (see the sketch after these comments). – caracal Dec 20 '11 at 12:11
  • See also https://stats.stackexchange.com/questions/268006/whats-the-difference-between-regression-and-analysis-of-variance – kjetil b halvorsen Dec 10 '19 at 10:52
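
To illustrate caracal's point, a minimal sketch using the simulated data above (group_f is a name introduced here, not from the thread):

group_f <- factor(group)                  # declare the grouping variable as a factor
summary(lm(score ~ group_f * moderator))  # lm() builds treatment contrasts automatically

With only two groups coded 0/1 the fit is unchanged; the factor declaration starts to matter with more levels or other codings.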

3 Answers

22

The summary function calls different methods depending on the class of the object. The difference isn't in aov vs. lm, but in the information presented about the models. For example, if you used anova(mod1) and anova(mod2) instead, you would get the same results.
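
For instance, reusing the simulated variables from the question (a minimal sketch of the comparison just described):

mod1 <- lm(score ~ group * moderator)   # fit once as a regression object
mod2 <- aov(score ~ group * moderator)  # and once as an aov object
anova(mod1)   # sequential (Type 1) ANOVA table
anova(mod2)   # the same table: the two models are identical underneath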

As @Glen says, the key is whether the tests reported are based on Type 1 or Type 3 sums of squares. These will differ whenever the correlation between your explanatory variables is not exactly 0. When the predictors are correlated, some SS are unique to one predictor and some to the other, but some SS could be attributed to either or both. (You can visualize this by imagining the MasterCard symbol: there's a small region of overlap in the center.) There is no unique answer in this situation, and unfortunately this is the norm for non-experimental data. One approach is for the analyst to use their judgment and assign the overlapping SS to one of the variables. That variable goes into the model first. The other variable goes into the model second and gets the SS that looks like a cookie with a bite taken out of it. Its effect can be tested by what is sometimes called $R^2$ change or F change. This approach uses Type 1 SS. Alternatively, you could do this twice, with each variable going in first, and report the F-change test for both predictors. In this way, neither variable gets the SS due to the overlap. This approach uses Type 3 SS. (I should also tell you that the latter approach is held in low regard.)

Following the suggestion of @BrettMagill in the comment below, I can try to make this a little clearer. (Note that, in my example, I'm using just 2 predictors and no interaction, but this idea can be scaled up to include whatever you like.)

Type 1: SS(A) and SS(B|A)

Type 3: SS(A|B) and SS(B|A)
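
In R terms, a minimal sketch with the question's two predictors and no interaction (car::Anova() is one common route to the marginal tests):

# Type 1 (sequential) SS depend on the order of entry:
anova(lm(score ~ group + moderator))   # SS(group), then SS(moderator | group)
anova(lm(score ~ moderator + group))   # SS(moderator), then SS(group | moderator)

# Type 3 (marginal) SS test each predictor adjusting for the other:
library(car)
Anova(lm(score ~ group + moderator), type = 3)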

gung - Reinstate Monica
  • This is a nice description of the problem. You might clarify the text a bit with this: Type I: SS_A = SS(A), SS_B = SS(B | A), and SS_AB = SS(AB | B, A); Type III: SS_A = SS(A | B, AB), SS_B = SS(B | A, AB), and SS_AB = SS(AB | A, B). – Brett Dec 19 '11 at 21:48
  • A very well-known member of the R community remarked on Type III SS: "Some of us feel that type III sum of squares and so-called ls-means are statistical nonsense which should have been left in SAS. -- Brian D. Ripley". This is fortune(54) from the R fortunes package. – Paul Hiemstra Dec 20 '11 at 10:52
  • Thank you so much for your help. I understand now what's going on in terms of how these models are different, but I'm still not clear on when it would be appropriate to use either an ANOVA or regression model. My advisor is advising ANOVA, but I've always been taught to use regression, and I'm not sure which is more appropriate when the results are divergent. Do you have any examples or a resource to advise on when either would be appropriate? Thanks again for your help. – Rebecca Dec 31 '11 at 22:40
  • I'm sorry, I don't quite follow. My point is that the models *aren't* actually different. An ANOVA is a regression with all qualitative predictors. If you have a regression model with continuous and qualitative predictors, and you enter the continuous predictor first, then the qualitative predictors (but without an interaction term), that's ANCOVA. Either approach is fine, since 'behind the scenes' they're identical. I usually code this as a regression, but that's a matter of style. OTOH, if your adviser wants it run ANOVA style, then go that route, as there is no difference. – gung - Reinstate Monica Jan 01 '12 at 05:55
  • So I guess my confusion is that although I understand that the models are statistically equivalent, and I can change the type of SS I'm using to see that, the methods still give me different results. Using regression (Type III SS?), both main effects and the interaction are significant, while using ANOVA (Type I SS?) one main effect is no longer significant. As I understand it from your explanations, the reason these are usually the same but in this case are different is because my variables are correlated (i.e., there is a significant interaction). – Rebecca Jan 02 '12 at 17:38
  • After your original response, I ran regression models and then compared them using anova(). Although the example below does not show my results, it shows the structure of what I did (sketched after these comments). The results show that model 2 does not explain significantly more variance than model 1 (so group doesn't predict a significant amount of variance in score once you account for the moderator?). However, model 3 does explain significantly more variance than model 2 (so the interaction term predicts a significant amount of variance after accounting for both main effects?). – Rebecca Jan 02 '12 at 17:48
  • If this interpretation is correct, I'm not sure whether it is appropriate to report model 1 or model 3. Even though model 1 is simpler, would it be incorrect to report model 3, where both main effects and the interaction are significant? mod1 – Rebecca Jan 02 '12 at 17:49
  • A few things: (3 up) an interaction does not mean your independent variables are correlated, these are just different things; (2 up) if model 3 is significantly better than model 2, then yes, this suggests the interaction is significant (since the interaction is the only thing that differs between them); (1 up) you want to avoid just fishing for significant effects unless you are thinking of your study as a pilot that you will use to plan a subsequent confirmatory study (in this case I think you're OK); I gather you ran this study to look at all three, thus go with model 3. – gung - Reinstate Monica Jan 02 '12 at 19:15
  • In addition, an interaction implies that you should not interpret the main effects, thus presenting only model 1 could be dangerously misleading. If you want more info on types of SS, I wrote a fairly comprehensive answer here: http://stats.stackexchange.com/questions/20452/how-to-interpret-type-i-sequential-anova-and-manova/20455#20455 Also, you should accept one of the answers, at some point, by clicking the check mark next to one of them. – gung - Reinstate Monica Jan 02 '12 at 19:18
  • Okay, that makes sense now. Thanks for all your help! I really appreciate it. – Rebecca Jan 10 '12 at 05:22
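
A sketch of the nested-model comparison Rebecca describes above (model names are placeholders; the data are the simulated variables from the question):

mod1 <- lm(score ~ moderator)                            # moderator only
mod2 <- lm(score ~ moderator + group)                    # add the group main effect
mod3 <- lm(score ~ moderator + group + moderator:group)  # add the interaction
anova(mod1, mod2, mod3)   # sequential F-change tests between nested models
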
11

The results from the aov output are giving you probabilities based on Type 1 sums of squares. This is why the interaction result is the same and the main effects differ.

If you use probabilities based on Type 3 sums of squares, then they will match the linear regression results.

library(car)   # provides Anova(), which supports Type II/III tests
Anova(aov(score ~ group * moderator), type = 3)
Axeman
Glen
  • Linear models and ANOVA will be equivalent when the models are testing the same hypotheses and when the parameterization of the factors is equivalent. So-called "Type I" and "Type III" sums of squares are simply tests of different underlying hypotheses (sequential versus marginal sums of squares). ANOVA tends to hide some of these decisions as implemented in many packages, a fact that makes me believe that actually setting up and testing the hypotheses of interest through factor parameterization and model comparison in GLM is a superior approach. – Brett Dec 19 '11 at 20:34
  • +1, I think you have a typo, though. lm is using Type 1 SS and aov is using Type 3 SS. – gung - Reinstate Monica Dec 19 '11 at 21:23
  • Type III (Marginal) Sums of Squares is used by default in lm. AOV would use Type I (Sequential) by default. LM results are invariant to order while aov results depend on the order of the factors. – Brett Dec 19 '11 at 21:28
  • I thought both lm and aov used type I by default, hence the use of capital A Anova() for type II and III. – Matt Albrecht Dec 20 '11 at 06:14
  • I seem to remember that type 3 sums of squares are somewhat controversial. See fortune(54), fortune(55) and fortune(56) in the fortunes package for R (http://cran.r-project.org/web/packages/fortunes/index.html). – Paul Hiemstra Dec 20 '11 at 10:50
  • In general, `Anova(..., type=3)` will **not** give you correct Type III SS unless you also switch from treatment contrasts (the default in R) to effect coding for unordered factors (`options(contrasts=c("contr.sum", "contr.poly"))`) or some other sum-to-zero contrast codes (e.g., Helmert). This will become apparent once you have unbalanced cell sizes and more than two groups, and is also mentioned in the help page for `Anova()` (see the sketch after these comments). – caracal Dec 20 '11 at 11:47
  • Good point @caracal. – Glen Dec 20 '11 at 18:29
  • @Paul I know when you start mixing continuous and categorical predictors SAS and R will disagree on the type 3 p-values, which probably follows from the controversy. – Glen Dec 20 '11 at 18:29
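
Following caracal's caveat, a sketch of setting sum-to-zero contrasts before requesting Type 3 tests (only relevant once group is declared a factor):

options(contrasts = c("contr.sum", "contr.poly"))        # sum-to-zero contrasts for unordered factors
Anova(lm(score ~ factor(group) * moderator), type = 3)   # Type 3 tests are now meaningful
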
-4

The main difference between linear regression and ANOVA is that in ANOVA the predictor variables are discrete (that is, they have different levels), whereas in linear regression the predictor variables are continuous.

vivek