Statistical significance test: One way Anova and Kruskal-Wallis test

Question

I was comparing data between two groups, and for each group there were six samples. My data are as follows:

group 1: 103.56, 103.32, 103.32, 104.27, 103.56, 103.8

group 2: 97.16, 97.16, 96.69, 98.58, 90.76, 97.64

I ran a one-way ANOVA and also a Kruskal-Wallis test on the two groups. The p-value from the ANOVA was much smaller than 0.05, indicating a significant difference between the two groups at the 0.05 significance level. The p-value from the Kruskal-Wallis test was 0.3553 (> 0.05), indicating no significant difference between the two groups at the same level.

I would really appreciate it if someone could give me some insight into why I am getting different p-values for the same data set from these two tests.

They make different assumptions about your data, & those assumptions have value. If you have only 2 groups, why didn't you try a t-test & a Mann-Whitney U-test? What software did you use? I don't get a p-value of .35 for KW. Can you paste your code or output? — gung - Reinstate Monica, Sep 23 '14 at 02:08

score 5 · Answer 1 · edited Apr 13 '17 at 12:44

In general, you wouldn't necessarily expect one-way ANOVA and the Kruskal-Wallis test to agree; sometimes they can give quite different p-values. See here for a little partial motivation for why you might expect a difference. [When samples are reasonably normal-looking and with means not too many standard errors apart, the two tests often give similar p-values. Outside that, they frequently don't.]

However, in this case the reason is more prosaic: Your Kruskal-Wallis p-value is wrong.

Here's a summary of results in R (details below).

                     p-value
Welch t-test:        0.001287
Equal-var. t-test:   8.552e-05
One way anova:       8.55e-05 
Wilcoxon test:       0.004847
Kruskal-Wallis:      0.003761

(Neither of the last two p-values is exact; if they were, the two tests would give the same p-value for a two-group comparison.)

Your problem is you're treating the second group's data as a factor (see the end of this answer).

Here's what I get in R with your data:

frh <- data.frame(group1 = c(103.56, 103.32, 103.32, 104.27, 103.56, 103.8),
                  group2 = c( 97.16,  97.16,  96.69,  98.58,  90.76,  97.64))

[Figure: strip chart of group1 and group2]

# Welch t-test:
> with(frh,t.test(group1,group2))

    Welch Two Sample t-test

data:  group1 and group2
t = 6.3316, df = 5.163, p-value = 0.001287
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
  4.368147 10.245186
sample estimates:
mean of x mean of y 
103.63833  96.33167


# equal-variance t-test:
> with(frh,t.test(group1,group2,var.equal=TRUE))

    Two Sample t-test

data:  group1 and group2
t = 6.3316, df = 10, p-value = 8.552e-05
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 4.735411 9.877922
sample estimates:
mean of x mean of y 
103.63833  96.33167


# one-way ANOVA:
> summary(aov(values~ind,stack(frh)))
            Df Sum Sq Mean Sq F value   Pr(>F)    
ind          1 160.16   160.2   40.09 8.55e-05 ***
Residuals   10  39.95     4.0
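
As a small check on the two matching p-values (8.552e-05 and 8.55e-05): with only two groups, the one-way ANOVA F statistic is the square of the equal-variance t statistic. A minimal sketch, assuming the same data as above; the names `long`, `tt`, `tab`, `t_squared` and `f_value` are just illustrative, and `anova(lm(...))` is used here in place of `summary(aov(...))` only because its table is easier to index.

```r
# Data from the question, in wide and long form
frh <- data.frame(group1 = c(103.56, 103.32, 103.32, 104.27, 103.56, 103.8),
                  group2 = c( 97.16,  97.16,  96.69,  98.58,  90.76,  97.64))
long <- stack(frh)   # columns: values (numeric), ind (factor: group1/group2)

# Equal-variance t-test and one-way ANOVA on the same data
tt  <- with(frh, t.test(group1, group2, var.equal = TRUE))
tab <- anova(lm(values ~ ind, data = long))

t_squared <- unname(tt$statistic)^2
f_value   <- tab[["F value"]][1]

all.equal(t_squared, f_value)               # F is the square of t
all.equal(tt$p.value, tab[["Pr(>F)"]][1])   # so the p-values coincide
```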


# Wilcoxon-Mann-Whitney:
> with(frh,wilcox.test(group1,group2))

    Wilcoxon rank sum test with continuity correction

data:  group1 and group2
W = 36, p-value = 0.004847
alternative hypothesis: true location shift is not equal to 0

Warning message:
In wilcox.test.default(group1, group2) :
  cannot compute exact p-value with ties


# Kruskal-Wallis test:
> kruskal.test(frh)

    Kruskal-Wallis rank sum test

data:  frh
Kruskal-Wallis chi-squared = 8.3958, df = 1, p-value = 0.003761

Those are all about as consistent with each other as I would expect on that data.
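
The small gap between the Wilcoxon and Kruskal-Wallis p-values appears to come from the continuity correction that `wilcox.test` applies by default to its normal approximation. A rough sketch, assuming that is the only difference here: turning the correction off should reproduce the Kruskal-Wallis chi-square p-value. The variable names are illustrative.

```r
group1 <- c(103.56, 103.32, 103.32, 104.27, 103.56, 103.8)
group2 <- c( 97.16,  97.16,  96.69,  98.58,  90.76,  97.64)

# Normal approximation with the default continuity correction
mw_corrected   <- wilcox.test(group1, group2, exact = FALSE, correct = TRUE)$p.value

# Same approximation without the continuity correction
mw_uncorrected <- wilcox.test(group1, group2, exact = FALSE, correct = FALSE)$p.value

# Kruskal-Wallis on the same two samples (chi-square approximation, 1 df)
kw <- kruskal.test(list(group1, group2))$p.value

c(mw_corrected = mw_corrected, mw_uncorrected = mw_uncorrected, kw = kw)
all.equal(mw_uncorrected, kw)   # the two approximations agree once the correction is off
```

Both are still approximations; neither is the exact permutation p-value.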

Now, here's how to get what you got for the Kruskal-Wallis:

> with(frh,kruskal.test(group1,group2))

    Kruskal-Wallis rank sum test

data:  group1 and group2
Kruskal-Wallis chi-squared = 4.3939, df = 4, p-value = 0.3553

The problem is that if you're getting this, you're calling the function incorrectly. With two vector arguments, kruskal.test(x, g) takes the second argument as the grouping variable, so group2 is being treated as a factor defining different groups for the data in group1.
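
To see where df = 4 comes from in that output, it may help to look at what the second argument becomes once it is coerced to a factor. A small sketch with the question's data (the names `g` and `res` are illustrative):

```r
group1 <- c(103.56, 103.32, 103.32, 104.27, 103.56, 103.8)
group2 <- c( 97.16,  97.16,  96.69,  98.58,  90.76,  97.64)

# Coercing the second argument to a factor shows the "groups" it defines
g <- factor(group2)
nlevels(g)   # 5 distinct values, because 97.16 appears twice
table(g)     # four "groups" of size 1 and one of size 2

# The mistaken call: group1 values split into those 5 groups
res <- kruskal.test(group1, group2)
res$parameter   # df = number of groups - 1 = 4
res$p.value
```

So the test is comparing five tiny "groups" of group1 values against each other, which has nothing to do with the intended comparison.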

So the main reason the Kruskal-Wallis test isn't giving you a p-value roughly similar to the ANOVA one is that you didn't call it correctly.
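
For reference, here is a sketch of two ways to make the intended call when the data are in long form (one column of values, one column of group labels). It assumes the same `frh` data frame as above; `long`, `kw_formula` and `kw_xg` are just illustrative names.

```r
frh <- data.frame(group1 = c(103.56, 103.32, 103.32, 104.27, 103.56, 103.8),
                  group2 = c( 97.16,  97.16,  96.69,  98.58,  90.76,  97.64))

# Long form: 'values' holds all 12 observations, 'ind' says which group each came from
long <- stack(frh)

# Formula interface: response ~ grouping factor
kw_formula <- kruskal.test(values ~ ind, data = long)

# Equivalent (x, g) interface: all values first, then the group labels
kw_xg <- kruskal.test(long$values, long$ind)

kw_formula$p.value
kw_xg$p.value
```

Both calls should give the same result as `kruskal.test(frh)`, since the second argument is now a genuine group label rather than the other sample.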

Any thoughts on why the p-values differ for MW & KW? I would have thought they'd be identical (as, eg, the t-test & ANOVA are)? — gung - Reinstate Monica, Sep 23 '14 at 02:44
@gung Several things going on. Neither are exact for ties (the wilcox.test output even warns you!) and K-W merely uses the chi-square approximation for the test statistic (its help explains how to get exact p-values by using a function in another package - `coin`). So they're both inexact, and not in the same way; if you do them both exactly, then they should be the same. If I wanted an effectively exact answer, I could run a permutation/randomization test on the ranks far quicker than I could load `coin` and get the syntax right. — Glen_b, Sep 23 '14 at 02:51
@gung In fact the exact two-tailed p-value is 2/924 $\approx$ 0.00216 ... (using `combn` and the sum of the ranks in the first sample) — Glen_b, Sep 23 '14 at 03:00
@gung in fact because the samples don't overlap at all, one can do it with simple mental arithmetic; one need only count how many arrangements into two groups of six there are (keeping in mind the pattern of ties); that will be the denominator, and since only the most extreme arrangement counts in the p-value, for a two-tailed test the numerator must be 2. — Glen_b, Sep 24 '14 at 00:00
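
The exact two-tailed value mentioned in the comments above can be reproduced by brute force. This is only a sketch of one way to do it, along the lines of the `combn` approach described: enumerate every way of assigning six of the twelve midranks to the first sample and count how many rank sums are at least as far from the null expectation as the observed one. The variable names are illustrative.

```r
group1 <- c(103.56, 103.32, 103.32, 104.27, 103.56, 103.8)
group2 <- c( 97.16,  97.16,  96.69,  98.58,  90.76,  97.64)

# Midranks of the pooled sample (ties get averaged ranks)
r   <- rank(c(group1, group2))
obs <- sum(r[seq_along(group1)])      # observed rank sum for group1

# Rank sum for every possible choice of 6 observations out of 12
sums <- combn(r, length(group1), FUN = sum)
length(sums)                          # choose(12, 6) arrangements

# Two-tailed: arrangements at least as far from the null mean as the observed one
center  <- length(group1) * mean(r)
p_exact <- mean(abs(sums - center) >= abs(obs - center) - 1e-9)
p_exact
```

Because the two samples don't overlap, only the observed arrangement and its mirror image are that extreme, which is why the numerator is 2.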
