Anova "versus" variance

Question

I think I didn't understand the meaning of ANOVA correctly so here's my question: Isn't it sufficient to calculate the mean and variance for each group manually?

I mean, it is certainly possible, so what is the benefit of ANOVA then?

Anova *is* about comparing variances. However not those of the *groups*. Rather the overall variance of the response is compared with the variance of the response *within groups*. — Michael M, Jul 26 '19 at 20:55
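
To make the comparison in this comment concrete, here is a minimal sketch in plain Python. The data and group names are made up for illustration. It splits the total variation of the response into a between-group part and a within-group part, then forms the one-way F statistic from the two.

```python
from statistics import mean

# Hypothetical measurements for three pre-specified groups.
groups = {
    "a": [4.1, 5.0, 4.6, 5.3],
    "b": [6.2, 5.8, 6.9, 6.5],
    "c": [5.1, 4.9, 5.6, 5.2],
}

values = [y for ys in groups.values() for y in ys]
grand_mean = mean(values)
group_means = {g: mean(ys) for g, ys in groups.items()}

# Total variation of the response around the grand mean.
ss_total = sum((y - grand_mean) ** 2 for y in values)
# Variation of each observation around its own group mean.
ss_within = sum(
    (y - group_means[g]) ** 2 for g, ys in groups.items() for y in ys
)
# Variation of the group means around the grand mean, weighted by group size.
ss_between = sum(
    len(ys) * (group_means[g] - grand_mean) ** 2 for g, ys in groups.items()
)

k, n = len(groups), len(values)
# F compares the between-group mean square with the within-group mean square.
f_stat = (ss_between / (k - 1)) / (ss_within / (n - k))

print(f"SS_total={ss_total:.3f} SS_between={ss_between:.3f} SS_within={ss_within:.3f}")
print(f"F({k - 1}, {n - k}) = {f_stat:.2f}")
```

The per-group means and variances alone do not give you this ratio; the F statistic is what turns them into a single test of whether the group means differ.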

score 2 · Accepted Answer · answered Jul 26 '19 at 16:29


ANOVA extends the t-test to more than two groups. In doing so, it asks a slightly different question from the t-test (phrasing it liberally): "Does putting the data into the different factor levels (groups) make sense?" In a predictive setting, this would mean "Does forming groups make my prediction better (in particular, does it reduce prediction error)?" In an explanatory setting, this means "Are there substantial enough differences between some groups to indicate a non-random grouping of the data?"

If an ANOVA indicates such significant differences between groups, researchers often employ post-hoc tests to identify which groups differ (significantly, after correcting for multiple testing). When doing so, the advantage of ANOVA over pairwise t-tests is indeed small, IMHO. Also, classical ANOVA assumes homogeneity of variance across all groups, while the t-test (in its more general formulation) can accommodate heterogeneous variances.

In a wider understanding of ANOVA (compatible at least with https://en.wikipedia.org/wiki/Analysis_of_variance), namely as multiple linear regression, possibly with categorical predictor(s), ANOVA allows for specifying complex designs, and its output focuses on statistical effects rather than regression coefficients, although it is mathematically equivalent to multiple linear regression (see the question "ANOVA vs multiple linear regression? Why is ANOVA so commonly used in experimental studies?").

answered Jul 26 '19 at 16:29

Carsten


Thank you! So when I perform Anova I will at first get information whether it makes sense or not to divide the data? I think this is a very important question! :) – Ben Jul 26 '19 at 18:22

No, the groups are pre-specified. – sjw Jul 26 '19 at 19:57

"whether it makes sense or not to divide the data" - from a conceptual standpoint, yes, but not from a procedural one. You should not use ANOVA to decide whether to do some later analysis without grouping the data. Rather, you use ANOVA to tell you whether your predetermined grouping scheme shows differences larger than would frequently be expected by chance. – Bryan Krause Jul 26 '19 at 20:01
Thanks! So, for example, I have two groups of individuals (e.g. men and women) and I want to test for an effect. I divide the data into these two groups, and when I perform an ANOVA it will tell me whether my separation is necessary or not? – Ben Jul 27 '19 at 15:53
ANOVA will tell you if there is a significant difference between men and women; eg, if there is a difference in the mean height of men and the mean height of women. – sjw Jul 27 '19 at 16:41
(You might say, well I could just use a t-test for that - indeed t-test is a special case of ANOVA.) – sjw Jul 27 '19 at 16:49
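
A small numerical check of the "special case" remark, as a sketch in plain Python with made-up heights: for two groups, the one-way ANOVA F statistic equals the square of the equal-variance (pooled) t statistic. This identity does not hold for Welch's unequal-variance t-test.

```python
from math import sqrt
from statistics import mean

# Hypothetical heights in cm for two groups.
men = [178.0, 182.5, 175.0, 180.0, 185.5]
women = [165.0, 170.5, 162.0, 168.0]


def sum_sq(ys):
    """Sum of squared deviations from the sample mean."""
    m = mean(ys)
    return sum((y - m) ** 2 for y in ys)


n1, n2 = len(men), len(women)

# Pooled variance, as in the equal-variance (Student) two-sample t-test.
pooled_var = (sum_sq(men) + sum_sq(women)) / (n1 + n2 - 2)
t_stat = (mean(men) - mean(women)) / sqrt(pooled_var * (1 / n1 + 1 / n2))

# One-way ANOVA F statistic for the same two groups.
grand_mean = mean(men + women)
ss_between = n1 * (mean(men) - grand_mean) ** 2 + n2 * (mean(women) - grand_mean) ** 2
ss_within = sum_sq(men) + sum_sq(women)
f_stat = (ss_between / 1) / (ss_within / (n1 + n2 - 2))

# The two printed numbers agree up to floating-point rounding.
print(t_stat ** 2, f_stat)
```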

score 1 · Answer 2 · answered Jul 26 '19 at 19:53


Analysis of variance (ANOVA) is really a misnomer: ANOVA amounts to fitting a linear model with the group as a categorical variable. Hence the benefits of ANOVA are the things you get from a linear model, which is more than just the means and variances of each group (the latter are descriptive, not a “model”).

Specifically, as mentioned in the answer above, you can specify interactions, find confidence intervals for contrasts, etc.
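
As a rough sketch of what a contrast looks like in practice (plain Python, made-up data, and a contrast chosen purely for illustration): the estimate is a weighted sum of group means, and its standard error uses the pooled within-group mean square from the ANOVA. A confidence interval would be the estimate plus or minus a t quantile (with the within-group degrees of freedom) times that standard error. The quantile is left out here because the standard library has no t distribution.

```python
from math import sqrt
from statistics import mean

# Hypothetical measurements for three pre-specified groups.
groups = {
    "a": [4.1, 5.0, 4.6, 5.3],
    "b": [6.2, 5.8, 6.9, 6.5],
    "c": [5.1, 4.9, 5.6, 5.2],
}
# Illustrative contrast: group "b" versus the average of "a" and "c".
# Contrast weights must sum to zero.
weights = {"a": -0.5, "b": 1.0, "c": -0.5}

group_means = {g: mean(ys) for g, ys in groups.items()}
n = sum(len(ys) for ys in groups.values())
k = len(groups)

# Pooled within-group mean square (the ANOVA error variance estimate).
ss_within = sum(
    (y - group_means[g]) ** 2 for g, ys in groups.items() for y in ys
)
df_within = n - k
mse = ss_within / df_within

estimate = sum(weights[g] * group_means[g] for g in groups)
std_err = sqrt(mse * sum(weights[g] ** 2 / len(groups[g]) for g in groups))

print(f"contrast estimate = {estimate:.3f}, standard error = {std_err:.3f}, df = {df_within}")
```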

answered Jul 26 '19 at 19:53

sjw


That's interesting but now I'm confused as it sounds like a linear mixed model? – Ben Jul 27 '19 at 15:55
A linear mixed model has both fixed and random effects. So a traditional linear model with only fixed effects is a special case of it. – sjw Jul 27 '19 at 16:46
How do I implement categorical variables into a linear model? – Ben Jul 27 '19 at 19:55
$Y_{ij}=\alpha_i+\epsilon_{ij}$. There is a different $\alpha_i$ for each of the $i=1,\dots,k$ levels aka groups. A grouping can be conceptualized as a categorical variable. – sjw Jul 27 '19 at 19:59
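
One common way to put a categorical variable into a linear model is indicator (dummy) coding. The sketch below uses plain Python, made-up data, and the cell-means coding that matches the formula above (one column per level, no intercept). It builds the design matrix and solves the least-squares problem, which here reduces to the group means.

```python
from statistics import mean

# Hypothetical observations: (group label, response).
data = [
    ("a", 4.1), ("a", 5.0),
    ("b", 6.2), ("b", 5.8), ("b", 6.9),
    ("c", 5.1), ("c", 4.9),
]
levels = sorted({g for g, _ in data})

# Indicator design matrix: one 0/1 column per level, no intercept column.
X = [[1.0 if g == level else 0.0 for level in levels] for g, _ in data]
y = [v for _, v in data]

# Normal equations (X^T X) alpha = X^T y. With indicator columns, X^T X is
# diagonal (the group counts), so each alpha_i can be solved on its own.
xtx_diag = [sum(row[j] for row in X) for j in range(len(levels))]
xty = [sum(row[j] * yi for row, yi in zip(X, y)) for j in range(len(levels))]
alpha_hat = {level: xty[j] / xtx_diag[j] for j, level in enumerate(levels)}

# For comparison: the plain group means.
group_means = {
    level: mean(v for g, v in data if g == level) for level in levels
}

print(alpha_hat)
print(group_means)
```

Statistical software usually uses a different but equivalent coding (an intercept plus k-1 indicator columns), in which the coefficients are differences from a reference level rather than the group means themselves.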
