
Suppose we have samples from 3 groups A, B, C. The hypothesis H0: the means of the 3 groups are equal could be tested using 3 separate t-tests:

test 1: mean(A) = mean(B), at level 0.05
test 2: mean(B) = mean(C), at level 0.05
test 3: mean(A) = mean(C), at level 0.05

It's known that we should prefer an ANOVA here, because the multiple-t-test procedure inflates the risk of a Type I error.
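(A quick way to see the size of the inflation: with three independent tests at level 0.05, the chance of at least one false rejection under H0 is $1 - 0.95^3 \approx 0.14$. The three t-tests actually share samples, so they are not truly independent and this is only an approximate bound, but the order of magnitude is right.)

```r
# Family-wise Type I error rate for 3 tests at alpha = 0.05,
# under the (approximate) assumption that the tests are independent
alpha <- 0.05
fwer <- 1 - (1 - alpha)^3
fwer  # 0.142625
```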

I would like an example with simulated data where the first procedure leads us to an error while the ANOVA returns the correct result.

Ideally the answer would include R code to simulate the experiment.

Thanks!

Donbeo
  • Just FYI, I've posted a rather elaborate question about the limiting case in which the one-way $F$ achieves $p < .05$ yet all pairwise comparisons are insignificant. – Nick Stauner Mar 20 '14 at 20:45

1 Answer


The following code simulates a situation where three groups are randomly generated from the same normal distribution, N(0, 25) (i.e. sd = 5). Here, one of the t-tests commits a Type I error that is not committed by an ANOVA on the same data.

set.seed(270)
# three samples of size 5, all drawn from the same N(0, 25) distribution
As = rnorm(5, mean = 0, sd = 5)
Bs = rnorm(5, mean = 0, sd = 5)
Cs = rnorm(5, mean = 0, sd = 5)

dat = data.frame(factor = rep(c("A", "B", "C"), each = 5),
                 response = c(As, Bs, Cs))

summary(aov(response ~ factor, data = dat))
t.test(As, Bs)
t.test(Bs, Cs)
t.test(As, Cs)

ANOVA output:

            Df Sum Sq Mean Sq F value Pr(>F)
factor       2  88.88   44.44   2.233   0.15
Residuals   12 238.82   19.90 

T-test output:

data:  As and Bs
t = -0.9327, df = 7.42, p-value = 0.3803

data:  Bs and Cs
t = -1.0132, df = 4.968, p-value = 0.3577

data:  As and Cs
t = -2.7043, df = 5.666, p-value = 0.03746*

So the t-test detects a significant difference between groups A and C ($\alpha = 0.05$), committing a Type I error. The ANOVA correctly suggests there is not enough evidence of a significant difference between the groups.

Underminer
  • As an additional note, if you do `p.adjust( c(0.3803, 0.3577, 0.03746) )` in R then you see adjusted p-values that are more in line with the ANOVA result. This just confirms that this is an example of inflated alpha from multiple comparisons. – Greg Snow Mar 20 '14 at 21:11
  • So I suppose that if I run this code with a random seed many times, I will get more errors using the 3 t-tests and fewer errors using ANOVA? – Donbeo Mar 20 '14 at 22:53
  • Yes, I had to change the seed a good many times to get the desired results and the lowest t-test p-value is usually a good bit lower than the F-test p-value. – Underminer Mar 21 '14 at 01:53
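The check suggested in the last two comments can be sketched directly. The following simulation (my own sketch, not from the original answer) repeats the experiment many times and counts how often each procedure rejects H0 at $\alpha = 0.05$, even though the three groups always come from the same distribution:

```r
# Monte Carlo comparison: "any of the 3 t-tests rejects" vs.
# the one-way ANOVA F-test, with all groups drawn from N(0, 25)
set.seed(1)
n_sims <- 2000
t_rej <- anova_rej <- logical(n_sims)

for (i in seq_len(n_sims)) {
  As <- rnorm(5, mean = 0, sd = 5)
  Bs <- rnorm(5, mean = 0, sd = 5)
  Cs <- rnorm(5, mean = 0, sd = 5)

  p_t <- c(t.test(As, Bs)$p.value,
           t.test(Bs, Cs)$p.value,
           t.test(As, Cs)$p.value)

  dat <- data.frame(grp = rep(c("A", "B", "C"), each = 5),
                    response = c(As, Bs, Cs))
  p_f <- summary(aov(response ~ grp, data = dat))[[1]][["Pr(>F)"]][1]

  t_rej[i]     <- any(p_t < 0.05)   # at least one t-test rejects
  anova_rej[i] <- p_f < 0.05        # the F-test rejects
}

mean(t_rej)      # noticeably above 0.05: inflated Type I error rate
mean(anova_rej)  # close to the nominal 0.05
```

The empirical rejection rate of the three-t-test procedure should come out well above the nominal 0.05, while the F-test stays near it, which is exactly the pattern the comments describe.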