On the Mathematics site, an OP who is just learning statistics described the difference between linear regression and ANOVA and asked whether his interpretation was correct. I responded that linear regression considers how a set of covariates relates to a response through a functional form (I could have added "that is linear in the parameters"), whereas ANOVA classifies the observations into one or more groups and tests for a difference between group means. A member downvoted my answer, saying that ANOVA can include continuous predictors as well. His own answer indicated that he took the term ANOVA to mean the testing of significant terms from the decomposition of variance in the general linear model. Our descriptions of linear regression agreed.

My question is: What do you think is the best answer? His answer? My answer? An explanation providing the two meanings of ANOVA? Something else?

Link: https://math.stackexchange.com/questions/183704

Michael R. Chernick
  • Can you provide the link to the discussion? Also we have an [anova tag](http://stats.stackexchange.com/questions/tagged/anova), but I don't have sufficient rep to suggest/create a synonym with your created analysis-of-variance tag, could you create the synonym? – Andy W Aug 18 '12 at 13:28
  • Also there is a related discussion in a prior question on the site (although not a duplicate), [Why is ANOVA taught / used as if it is a different research methodology compared to linear regression?](http://stats.stackexchange.com/q/555/1036) – Andy W Aug 18 '12 at 13:29
  • It seems like you're talking about two different but related things. Since the other commenter was referencing `R` functions, I think you're talking about `aov()` (http://stat.ethz.ch/R-manual/R-patched/library/stats/html/aov.html) and he's talking about `anova()` (http://stat.ethz.ch/R-manual/R-patched/library/stats/html/anova.lm.html)... – smillig Aug 18 '12 at 13:31
  • @smillig The discussion was not about computer routines. We just discussed SAS in my case and R in his as support to our positions. It is common to refer to the variance decomposition table in the general linear model as the ANOVA table. But ANOVA is usually differentiated from regression in the way I stated. Thanks for linking the other post. – Michael R. Chernick Aug 18 '12 at 13:59
  • I can't understand why nobody uses the purpose argument, i.e. that linear regression is a *modelling* method and ANOVA is a *hypothesis testing* framework. I know that they can be extended in a way they overlap, but the canonical formulations are perfectly separated. –  Aug 18 '12 at 14:26
  • @MichaelChernick I know the discussion wasn't about computer routines, but since the other commentator was referring to `R` functions, I thought the documentation for what each of those things does might shed light on where each of you is coming from (which, as I think mbq is alluding to in his comment, is probably behind the differences between the answers on the original post). I think your answer is correct and definitely didn't deserve to be downvoted. – smillig Aug 18 '12 at 15:15
  • @smillig Thank you. My references were to the SAS routines which I think show the kind of separation between ANOVA and linear regression that I suggested. – Michael R. Chernick Aug 18 '12 at 15:19
  • If it's any consolation, Michael, others have had experiences like yours on the math site. There seems to be a tendency for downvotes to replace constructive conversation about differences of opinion or interpretation (perhaps because pure mathematics does not seem to allow for any such differences, since from a certain immature perspective (which is present even in some professionals), all answers are either *right* or *wrong*: whence, I would contend, questions like the one you reference don't belong on the math site at all). – whuber Aug 18 '12 at 16:01
  • @whuber: I agree it belongs here. If you and Michael haven't already done so, please consider casting votes to migrate the math.SE question. Your two votes would finish it off. – cardinal Aug 18 '12 at 16:59
  • @whuber One point to keep in mind is that one of the downvotes came from Michael Hardy who I understand is a professional statistician. – Michael R. Chernick Aug 18 '12 at 17:03
  • @cardinal : How does one vote to migrate? There's a button for voting to close, but how does one express a desire to migrate? – Michael Hardy Aug 19 '12 at 02:06
  • @whuber : I am one of those who have complained of downvotes without any comments saying what's wrong. I've rarely downvoted things, but I think I've always commented when I did so. – Michael Hardy Aug 19 '12 at 02:07
  • @MichaelHardy: Vote to close as off-topic and select stats.SE as the destination. If the five close-voters do this, it should be automatically migrated here. :) – cardinal Aug 19 '12 at 02:09
  • @cardinal : Done. – Michael Hardy Aug 19 '12 at 02:15
  • In ANOVA, we are trying to find how much of the variance is accounted for by our manipulation of the independent variables. In multiple regression, we do not directly manipulate the independent variables, but instead just measure the naturally occurring levels of variables and see if this helps us to predict the dependent variable. However, ANOVA and multiple regression are fundamentally the same since both of them try to explain the variance in the level of one variable on the basis of the level of one or more other variables. Ref: "SPSS for Psychologists: A Guide to Data Analysis Using SPSS for Windows" – Stat Aug 19 '12 at 04:42
  • An ANOVA with continuous predictors would usually be called ANCOVA (analysis of covariance). The GLM (general linear model) framework allows for a principled comparison (the difference between modelling and hypothesis testing notwithstanding). – jank Oct 24 '13 at 17:39

1 Answer

ANOVA and linear regression are twin princesses raised in different castles. Please see Andy Field's book, Discovering Statistics Using SPSS. He gives a very nice explanation, including how the two evolved over time. Put bluntly: they are very similar and were developed in parallel for a certain period of time by different scientific communities. NB: Of course the comparison is software independent.
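
To make the "very similar" point concrete, here is a minimal sketch in Python (NumPy only, not taken from Field's book or from either answer on the math site). It computes the one-way ANOVA F statistic from group means, then fits a linear regression on dummy-coded group indicators and computes the overall F statistic from that fit. The data and all variable names are made up for illustration; the claim being checked is only that the two F statistics coincide for a single categorical predictor.

```python
import numpy as np

# Hypothetical data: three groups of four observations each (invented for illustration).
groups = {
    "a": np.array([4.1, 5.0, 4.6, 5.3]),
    "b": np.array([6.2, 5.8, 6.9, 6.5]),
    "c": np.array([5.1, 4.9, 5.6, 5.2]),
}

y = np.concatenate(list(groups.values()))
labels = np.repeat(list(groups.keys()), [len(v) for v in groups.values()])
n, k = y.size, len(groups)
grand_mean = y.mean()

# Classical one-way ANOVA: between-group vs. within-group sums of squares.
ss_between = sum(len(v) * (v.mean() - grand_mean) ** 2 for v in groups.values())
ss_within = sum(((v - v.mean()) ** 2).sum() for v in groups.values())
f_anova = (ss_between / (k - 1)) / (ss_within / (n - k))

# Linear regression: intercept plus dummy indicators for groups "b" and "c"
# (group "a" is the reference level).
X = np.column_stack([np.ones(n), labels == "b", labels == "c"]).astype(float)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta
ss_resid = (residuals ** 2).sum()
ss_total = ((y - grand_mean) ** 2).sum()

# Overall F test of the regression: explained vs. residual sums of squares.
f_regression = ((ss_total - ss_resid) / (k - 1)) / (ss_resid / (n - k))

print("ANOVA F:     ", f_anova)
print("Regression F:", f_regression)
```

With this coding the intercept estimates the mean of group "a" and each dummy coefficient estimates a difference from that mean, so the fitted values are just the group means. That is why the regression residual sum of squares equals the ANOVA within-group sum of squares.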

  • Your reference to Andy Field's book is extremely helpful, thank you! The discussion you are referring to is at page ~350 in the 3rd edition. – cmo Oct 02 '14 at 19:14