
I'm using Kruskal-Wallis as a non-parametric ANOVA for data with a non-normal distribution. My understanding is that it assumes the distribution in each group of the independent variable has the same shape. As it turns out, three of my four groups have approximately the same shape, but the fourth does not: it is roughly the inverse of the first two.

Does this invalidate significance? Are there any steps I should take to mitigate its impact?

Update based on comment and further steps taken:

@Miroslav Sabo pointed me to a similar question he asked about using Mann-Whitney when certain assumptions are not met. That answer suggested Welch's correction.

Nick Cox
Thain
  • This may be helpful http://stats.stackexchange.com/questions/65405/what-to-do-when-a-mann-whitney-u-assumption-is-violated#comment126284_65405 – sitems Jul 27 '13 at 10:06
  • Hi Miroslav. Your link is helpful. However, my data has four groups rather than two, hence Kruskal-Wallis rather than Mann-Whitney. Are there tests equivalent to Welch's correction that work for more than two samples? – Thain Jul 27 '13 at 10:31
  • Mann-Whitney and its direct extension to multiple groups Kruskal-Wallis do _not_ require the distributions be of the same shape. The tests compare gravity locations, assess stochastic dominance. But if you choose to add the equality-of-shapes assumption then the tests are about the difference in shift (i.e. in any quantile's position). – ttnphns Jul 27 '13 at 11:29
  • I deleted the duplicate material about `wtest`, already answered in another question. – Nick Cox Jul 27 '13 at 13:18

1 Answer


As ttnphns commented, neither the Kruskal-Wallis test nor the rank sum test makes any assumption about distributional similarity between groups. A point of confusion sometimes arises because these tests admit two readings. In the most general sense they are tests for stochastic dominance (e.g., H$_{0} \text{: P}(X_{A} > X_{B}) = \frac{1}{2}$). With two additional assumptions, (1) that the distributions are the same shape, and (2) that any differences between the distributions of the groups are differences of central location, the tests can be interpreted as tests for median difference (e.g., H$_{0} \text{: } \tilde{x}_{A} = \tilde{x}_{B}$).
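To make the stochastic dominance reading concrete, here is a minimal sketch that estimates $P(X_{A} > X_{B})$ from two samples by counting pairs, with ties counted as one half. The function name `prob_superiority` and the sample values are hypothetical and chosen only for illustration; the estimate equals the Mann-Whitney $U$ statistic divided by $n_{A} n_{B}$.

```python
def prob_superiority(a, b):
    """Estimate P(X_A > X_B) + 0.5 * P(X_A = X_B) from two samples.

    Hypothetical helper for illustration: it compares every pair (x, y)
    with x from a and y from b, counting ties as half a win.
    """
    wins = sum(1 for x in a for y in b if x > y)
    ties = sum(1 for x in a for y in b if x == y)
    return (wins + 0.5 * ties) / (len(a) * len(b))


# Made-up data: a right-skewed sample and a left-skewed one.
group_a = [1, 1, 2, 2, 3, 5, 9]
group_b = [1, 5, 7, 8, 8, 9, 9]

# A value near 0.5 is what the general null hypothesis describes;
# values far from 0.5 indicate that one group tends to exceed the other.
print(prob_superiority(group_a, group_b))
```

Nothing in this calculation refers to the shape of either distribution, which is why a difference in shape does not by itself undermine the test.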

Therefore, significance is not an issue, and there is nothing to "mitigate." However, the substantive interpretation will change: because one group differs in shape, a significant result speaks to stochastic dominance rather than to a difference in medians, means, etc.
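For a four-group case like the one in the question, the test statistic itself can be sketched as follows. This is a minimal pure-Python illustration, assuming the standard $H$ formula with midranks and the usual tie correction; the function name `kruskal_wallis_h` and the data are hypothetical. In practice a library routine such as `scipy.stats.kruskal` would normally be used, and it also returns a p-value.

```python
from itertools import groupby


def kruskal_wallis_h(*groups):
    """Tie-corrected Kruskal-Wallis H statistic (hypothetical helper).

    Assumes at least two distinct values overall; otherwise the tie
    correction is zero and the statistic is undefined.
    """
    # Pool all observations, remembering which group each came from.
    pooled = sorted((x, g) for g, grp in enumerate(groups) for x in grp)
    n_total = len(pooled)

    rank_sums = [0.0] * len(groups)
    tie_term = 0.0
    pos = 0  # number of observations already ranked
    for _, run in groupby(pooled, key=lambda pair: pair[0]):
        run = list(run)
        t = len(run)
        midrank = pos + (t + 1) / 2  # average of ranks pos+1 .. pos+t
        for _, g in run:
            rank_sums[g] += midrank
        tie_term += t**3 - t
        pos += t

    h = 12.0 / (n_total * (n_total + 1)) * sum(
        r * r / len(grp) for r, grp in zip(rank_sums, groups)
    ) - 3 * (n_total + 1)
    correction = 1 - tie_term / (n_total**3 - n_total)
    return h / correction


# Made-up data: three right-skewed groups and a fourth, left-skewed group.
g1 = [1, 1, 2, 2, 3, 5, 9]
g2 = [1, 2, 2, 3, 3, 6, 10]
g3 = [2, 2, 3, 3, 4, 6, 11]
g4 = [1, 5, 7, 8, 8, 9, 9]

# Under the null hypothesis, H is approximately chi-square distributed
# with (number of groups - 1) degrees of freedom.
print(kruskal_wallis_h(g1, g2, g3, g4))
```

The statistic depends only on the pooled ranks, so it is computed the same way whether or not the groups share a shape; only the interpretation of a significant result changes.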

Alexis
  • +1 ... but: If they're the same shape apart from location shift, it's not just a test for difference in one location-estimator, but all of them (or at least all that exist). – Glen_b Apr 26 '14 at 18:33