27

I am learning R and have been experimenting with analysis of variance. I have been running both

kruskal.test(depVar ~ indepVar, data=df)

and

anova(lm(depVar ~ indepVar, data=df))

Is there a practical difference between these two tests? My understanding is that they both evaluate the null hypothesis that the populations have the same mean.

Ferdi
JHowIX

4 Answers

37

There are differences in the assumptions and the hypotheses that are tested.

The ANOVA (and t-test) is explicitly a test of equality of means of values. The Kruskal-Wallis (and Mann-Whitney) can be seen technically as a comparison of the mean ranks.

Hence, in terms of the original values, the Kruskal-Wallis tests something more general than a comparison of means: it tests whether a random observation from one group is equally likely to fall above or below a random observation from another group. The population quantity that underlies that comparison is neither the difference in means nor the difference in medians; (in the two-sample case) it is the median of all pairwise differences between the samples - the between-sample Hodges-Lehmann difference.
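As a hedged illustration (two-sample case, invented data): the between-sample Hodges-Lehmann difference is just the median of all pairwise differences, and `wilcox.test(..., conf.int=TRUE)` reports essentially the same quantity as its location estimate:

```r
set.seed(1)
x <- rnorm(20)         # group 1
y <- rnorm(20) + 0.5   # group 2, shifted up

# Hodges-Lehmann difference: median of all pairwise differences y_j - x_i
median(outer(y, x, "-"))

# wilcox.test's "difference in location" should match this estimator
wilcox.test(y, x, conf.int = TRUE)$estimate
```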

However, if you choose to make some restrictive assumptions, then Kruskal-Wallis can be seen as a test of equality of population means, as well as of quantiles (e.g. medians), and indeed of a wide variety of other measures. That is, if you assume that the group distributions under the null hypothesis are the same, and that under the alternative the only change is a distributional shift (a so-called "location-shift alternative"), then it is also a test of equality of population means (and, simultaneously, of medians, lower quartiles, etc.).

[If you do make that assumption, you can obtain estimates of and intervals for the relative shifts, just as you can with ANOVA. Well, it is also possible to obtain intervals without that assumption, but they're more difficult to interpret.]

If you look at the answer here, especially toward the end, it discusses the comparison between the t-test and the Wilcoxon-Mann-Whitney, which (when doing two-tailed tests at least) are the equivalent* of ANOVA and Kruskal-Wallis applied to a comparison of only two samples; it gives a little more detail, and much of that discussion carries over to the Kruskal-Wallis vs ANOVA.

* (aside a particular issue that arises with multigroup comparisons where you can have non-transitive pairwise differences)

It's not completely clear what you mean by a practical difference. You use them in a generally similar way. When both sets of assumptions hold they usually give fairly similar results, but they can certainly give quite different p-values in some situations.
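To make that concrete, here is a minimal sketch (invented data, with variable names echoing the question) running both tests on the same data frame:

```r
set.seed(42)
df <- data.frame(
  depVar   = c(rnorm(15, 0), rnorm(15, 0.8), rnorm(15, 1.2)),
  indepVar = factor(rep(c("A", "B", "C"), each = 15))
)

anova(lm(depVar ~ indepVar, data = df))     # parametric one-way ANOVA
kruskal.test(depVar ~ indepVar, data = df)  # rank-based Kruskal-Wallis
```

With roughly normal, equal-variance groups like these, the two p-values are usually in the same ballpark; with heavy tails or strong skew they can diverge.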

Edit: Here's an example of the similarity of inference even at small samples: the joint acceptance region for the location-shifts among three groups (the second and third each compared with the first) sampled from normal distributions (with small sample sizes), for a particular data set, at the 5% level:

[Figure: acceptance regions for location-differences in Kruskal-Wallis and ANOVA]

Numerous interesting features can be discerned -- for example, the slightly larger acceptance region for the KW in this case, with its boundary consisting of vertical, horizontal and diagonal straight-line segments (it is not hard to figure out why). The two regions tell us very similar things about the parameters of interest here.

Glen_b
  • +1. I dared to edit it slightly just to add emphasis where I thought it necessary. Please see now whether you agree or not. – ttnphns Nov 10 '13 at 08:58
  • @ttnphns thanks for the edit. There are some particular reasons why some of the things you changed were in there, so I may edit some of the original back in. However, perhaps I should make it clearer *why* I wrote it as I had it before. But first I want to think carefully about how best to keep as much of your changes as I can. – Glen_b Nov 10 '13 at 09:01
6

Yes, there is. The ANOVA is a parametric approach, while kruskal.test is a non-parametric approach, so kruskal.test does not require a distributional assumption.
From a practical point of view, when your data are skewed, ANOVA would not be a good approach to use. Have a look at this question for an example.
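A small sketch of that practical point, using invented lognormal (right-skewed) groups; the rank-based test is unaffected by the long tail, whereas the F-test operates on the raw skewed values:

```r
set.seed(7)
g <- factor(rep(1:3, each = 20))
# heavily right-skewed groups, shifted on the log scale
y <- exp(rnorm(60, mean = rep(c(0, 0.5, 1), each = 20)))

anova(lm(y ~ g))     # F-test on the raw, skewed values
kruskal.test(y ~ g)  # uses only the ranks of y
```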

Stat
  • 7,078
  • 1
  • 24
  • 49
  • I would say that Kruskal-Wallis ANOVA makes relaxed assumptions regarding distributions compared to parametric ANOVA: observations in each group come from populations with *similar shape*. Heteroskedasticity or highly skewed distributions remain as problematic as with traditional tests. – chl Nov 09 '13 at 20:22
  • How so, @chl? The ranks aren't changed by skew, and KW is rank based. What am I missing? – Peter Flom Nov 09 '13 at 20:53
  • @PeterFlom The KW test assumes that sampled populations have identical shape and dispersion, although in most cases little departure from those assumptions will not affect the results. When parametric assumptions are met, the test is $3/\pi$ as powerful as one-way ANOVA. Regarding rank-based test statistics, some studies suggest, however, that varying degrees of skewness may inflate the nominal type I error rate, see, e.g., [Fagerland and Sandvik (2009)](http://www.ncbi.nlm.nih.gov/pubmed/19247980), or some [other](http://goo.gl/umM0Nn) [references](http://goo.gl/8BGbjH). – chl Nov 09 '13 at 21:17
  • @chl The $H_0$ hypothesis is the equality of the distributions, thus the identical shape assumption is only related to the power, isn't it ? – Stéphane Laurent Nov 09 '13 at 22:54
  • @StéphaneLaurent If the shapes are not identical it may lead to bad inference. [see my example here](http://stats.stackexchange.com/questions/71452/testing-for-significance-between-means-having-one-normal-distributed-sample-and/74192#74192) – Flask Nov 10 '13 at 05:55
  • @chl, Kruskal-Wallis (like its 2-sample variant Mann-Whitney) makes _no_ distributional assumptions in general (i.e. when testing its intrinsic hypothesis about stochastic prevailing). It's only when you narrow the hypothesis to testing the mean or a quantile that the "similar shape" assumption arises. – ttnphns Nov 10 '13 at 08:25
  • @Flask, shapes are obviously identical under $H_0$ (equal distributions) – Stéphane Laurent Nov 10 '13 at 08:40
  • @ttnphns Yes for the type I error. But I think KW is only powerful under such assumptions. – Stéphane Laurent Nov 10 '13 at 08:41
  • @StéphaneLaurent, please see Glen_b's answer. Since in _general case_ the two tests differ in what they test, no issue of power comparison can arise in _that_ case. – ttnphns Nov 10 '13 at 09:03
  • @ttnphns I'm not talking about power comparison, but about the power of KW: this test is made to detect difference between distributions with identical shapes. I mean the test has low power to detect a difference between distributions without this assumption. – Stéphane Laurent Nov 10 '13 at 10:30
  • @ttnphns I was considering a test of the null hypothesis that location parameters are all equal (vs. at least one pair of treatments differ wrt. location--so-called location-shift, as you or Glen pointed out in several earlier comments). Isn't this what we are interested in when switching from parametric one-way ANOVA to a non-parametric alternative? – chl Nov 10 '13 at 10:30
  • @StéphaneLaurent The $3/\pi$ ARE that I mentioned was in reference to one-way ANOVA (normal parent distributions). It is related to power. I take the following from Andrews, FC, [Asymptotic Behavior of Some Rank Tests for Analysis of Variance](http://goo.gl/x3yRSq) (Ann. Math. Statist. 25(4) (1954), 724-736): When compared with ANOVA F-test, the KW H-test has a 95.5% chance of "choosing a sequence of alternative hypotheses that vary with the sample sizes in such a manner that the powers of the two tests for this sequence of alternatives have a common limit of less than 1." – chl Nov 10 '13 at 10:55
  • @chl, I get you and am not arguing. But I want to stress that, partially, the controversy may be due to the vague term "location", which people **understand** differently. In _my_ understanding (which is btw not the same as Glen's), KW is _always_ a test of location (which I take broadly as a nonparametric gravity accent), but it is a test of _shift_ (of mean or quantile) only under that equality-of-shapes assumption (hence I don't mix the words as in "location shift"). To your `what we are interested... switching from ANOVA to [KW]?` I'd say: sometimes just shift, often - location broadly. – ttnphns Nov 10 '13 at 12:39
4

As far as I know (but please correct me if I'm wrong, because I'm not sure), the Kruskal-Wallis test is constructed to detect a difference between two distributions having the same shape and the same dispersion, that is, one obtained by translating the other by a shift $\Delta$:

[Figure: two density curves of identical shape, offset horizontally by $\Delta$]

Let's call $(*)$ this assumption. The KW test tests the null hypothesis $H_0\colon\{\Delta=0\}$ vs $H_1\colon\{\Delta \neq 0\}$. However, the KW test is "valid" without assumption $(*)$: its significance level (the probability of rejecting $H_0$ under $H_0$) is preserved because $(*)$ is trivially fulfilled under $H_0\colon\{\text{the distributions are equal}\}$.

But the KW test is "inefficient" if $(*)$ does not hold: it is only designed to have good power to detect $\Delta \neq 0$, and the test statistic is not well suited to reflect the difference between two distributions when no such $\Delta$ exists.

Consider the following example. Two samples $x$ and $y$ of size $n=1000$ are generated from two quite different distributions that nonetheless have the same mean. KW then fails to reject $H_0$.

set.seed(666)
n <- 1000
x <- rnorm(n)                                # standard normal sample
y <- (2*rbinom(n, 1, 1/2) - 1)*rnorm(n, 3)   # symmetric bimodal mixture, mean 0
plot(density(x, from = min(y), to = max(y)))
lines(density(y), col = "blue")

[Figure: density estimates of x (black, unimodal) and y (blue, bimodal)]

> kruskal.test(list(x,y))

    Kruskal-Wallis rank sum test

data:  list(x, y)
Kruskal-Wallis chi-squared = 2.482, df = 1, p-value = 0.1152

As I said at the beginning, I'm not sure about the precise construction of KW. Maybe my answer is more accurate for another nonparametric test (the Mann-Whitney, perhaps?), but the approach should be similar.

Stéphane Laurent
  • `Kruskal-Wallis test is constructed in order to detect a difference between two distributions having the same shape and the same dispersion` As mentioned in Glen's answer, the comments, and many other places on this site, this is true but is a narrowed reading of what the test does. The `same shape/dispersion` condition is actually not an intrinsic assumption but an additional one, used in some situations and not in others. – ttnphns Nov 10 '13 at 13:59
  • P.S. Your 2nd example does not contradict or refute the KW test. The H0 of the test is _not_ `distributions are equal`; it is a mistake to think so. The H0 is only that, figuratively, the two points of "condensation of the gravities" do not deviate from each other. – ttnphns Nov 10 '13 at 15:25
  • @ttnphns I believe you, I don't know. But commonly we consider $H_0$ as the equality (see e.g. the article on wikipedia). – Stéphane Laurent Nov 10 '13 at 15:56
  • No, you don't have to believe me... nor should you put too much faith in what's written in Wikipedia either. Wikipedia is a dog-eat-dog place, and it changes frequently. – ttnphns Nov 10 '13 at 16:16
  • I just say this is a common belief. According to the help of `kruskal.test()` in R, $H_0$ is the equality of the location parameters of the distribution. In practice I think we often use KW to assess a difference between the distributions. Hence we could assume the same shape (as we do in the Gaussian ANOVA case) and apply KW; this makes sense. – Stéphane Laurent Nov 10 '13 at 16:24
  • Yeah. `the equality of the location parameters of the distribution` is the right formulation (albeit "location" shouldn't be thought of as just a mean or median, in the general case). _If_ you assume same shapes, then, naturally, this same H0 becomes "identical distributions". – ttnphns Nov 10 '13 at 16:32
  • I'm not sure to understand this thread: http://stats.stackexchange.com/q/69448/8402 Does that mean that KW does not respect the nominal significance level when $H_0$ is true but other assumptions are not fulfilled ? – Stéphane Laurent Nov 10 '13 at 16:41
1

Kruskal-Wallis is rank-based rather than value-based. This can make a big difference if the distributions are skewed or if there are extreme values.
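A hedged sketch of that point (invented data): ranks are invariant under monotone transformations, so the Kruskal-Wallis statistic is identical for `y` and `exp(y)`, while the ANOVA F statistic changes:

```r
set.seed(3)
g <- factor(rep(1:2, each = 15))
y <- rnorm(30, mean = rep(c(0, 1), each = 15))

kruskal.test(y ~ g)$statistic        # same as for exp(y)...
kruskal.test(exp(y) ~ g)$statistic   # ...ranks unchanged by a monotone map
anova(lm(y ~ g))[["F value"]][1]       # F statistic changes...
anova(lm(exp(y) ~ g))[["F value"]][1]  # ...under the same transform
```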

Peter Flom