Can Mann-Whitney test be used for post-hoc comparisons after Kruskal-Wallis?

Question

I have a simulation where an animal is placed in a hostile environment and timed to see how long it can survive using some approach to survival. There are three approaches it can use to survive. I ran 300 simulations of the animal using each survival approach. All simulations take place in the same environment but there's some randomness so it's different each time. I time how many seconds the animal survives in each simulation. Living longer is better. My data looks like this:

Approach 1, Approach 2, Approach 3
45,79,38
48,32,24
85,108,44
... 300 rows of these

I'm unsure of everything I do after this point so let me know if I'm doing something stupid and wrong. I'm trying to find out if there's a statistical difference on lifespan using a particular approach.

I ran a Shapiro-Wilk test on each of the samples and they came back with tiny p-values, so I believe the data isn't normally distributed.

Data on rows have no relationship to each other. The random seed used for each simulation was different. As a result, I believe the data isn't paired.

Because the data is not normally distributed, not paired, and there were more than two samples, I ran a Kruskal-Wallis test, which came back with a p-value of 0.048. I then moved on to a post hoc test, selecting Mann-Whitney. I'm really not sure if Mann-Whitney should be used here.

I compared each survival approach with each other approach by performing the Mann-Whitney test, i.e. {(approach 1, approach 2), (approach 1, approach 3), (approach 2, approach 3)}. For the pair (approach 2, approach 3), a two-tailed test found no statistically significant difference, but a one-tailed test did.

Problems:

I don't know if using Mann Whitney like this makes sense.
I don't know if I should be using a one or two tailed Mann Whitney.

Do you have any a priori hypothesis about the relative strength of different approaches (e.g. approach1>approach2>approach3)? This is crucial to answer your questions. — amoeba, Aug 14 '14 at 16:48
I have the mean, median and standard deviation and it appears that approach 3 is better because it has a higher median and mean but it also has a much higher standard deviation so I'm not sure. But I had no way of knowing this before hand. — Phlox Midas, Aug 14 '14 at 17:07
Phlox: if there was "no way of knowing this before hand", you should absolutely **not** use a one-tailed test, only two-tailed (as @Alexis mentioned in his reply as well). — amoeba, Aug 15 '14 at 15:32

Alexis · Accepted Answer · 2017-10-03T21:42:07.893

No, you should not use the Mann-Whitney $U$ test in this circumstance.

Here's why: Dunn's test is an appropriate post hoc test^* following rejection of a Kruskal-Wallis test. If one proceeds by moving from a rejection of Kruskal-Wallis to performing ordinary pair-wise rank sum (i.e. Wilcoxon or Mann-Whitney) tests, then two problems obtain: (1) the ranks used for the pair-wise rank sum tests are not the ranks used by the Kruskal-Wallis test; and (2) the rank sum tests do not use the pooled variance implied by the Kruskal-Wallis null hypothesis. Dunn's test does not have these problems.
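To make those two points concrete, here is a minimal sketch of Dunn's z statistic in plain Python. It ranks all observations together (the same ranks Kruskal-Wallis uses) and divides each difference in mean ranks by a standard error built from the pooled variance, with the usual tie correction. The function names `pooled_midranks` and `dunn_test` are made up for this illustration and do not come from any package, and the toy data are just the three rows shown in the question, far too few for a real analysis. The p-values returned are two-sided and not yet adjusted for multiple comparisons.

```python
import math
from itertools import combinations


def pooled_midranks(groups):
    """Rank all observations together; tied values share their average rank."""
    pooled = sorted((value, g) for g, values in enumerate(groups) for value in values)
    n_total = len(pooled)
    rank_sums = [0.0] * len(groups)
    tie_term = 0.0  # accumulates t^3 - t over each block of t tied values
    start = 0
    while start < n_total:
        end = start
        while end + 1 < n_total and pooled[end + 1][0] == pooled[start][0]:
            end += 1
        t = end - start + 1
        midrank = (start + 1 + end + 1) / 2.0  # average of 1-based ranks in the block
        for _, g in pooled[start:end + 1]:
            rank_sums[g] += midrank
        tie_term += t ** 3 - t
        start = end + 1
    return rank_sums, tie_term, n_total


def dunn_test(groups):
    """Return {(i, j): (z, two_sided_p)} for every pair of groups (unadjusted)."""
    rank_sums, tie_term, n_total = pooled_midranks(groups)
    mean_ranks = [s / len(g) for s, g in zip(rank_sums, groups)]
    # Pooled variance of the ranks under the Kruskal-Wallis null, tie-corrected.
    pooled_var = n_total * (n_total + 1) / 12.0 - tie_term / (12.0 * (n_total - 1))
    if pooled_var <= 0:
        raise ValueError("all observations are tied; the statistic is undefined")
    results = {}
    for i, j in combinations(range(len(groups)), 2):
        se = math.sqrt(pooled_var * (1.0 / len(groups[i]) + 1.0 / len(groups[j])))
        z = (mean_ranks[i] - mean_ranks[j]) / se
        p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
        results[(i, j)] = (z, p)
    return results


# Toy data: the three rows shown in the question (illustration only).
groups = [[45, 48, 85], [79, 32, 108], [38, 24, 44]]
for (i, j), (z, p) in dunn_test(groups).items():
    print(f"approach {i + 1} vs approach {j + 1}: z = {z:.3f}, unadjusted p = {p:.3f}")
```

A pairwise Mann-Whitney test would instead re-rank only the two groups being compared and use a variance based on those two groups alone, which is where it departs from the Kruskal-Wallis null.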

Note that post hoc tests which have been adjusted for multiple comparisons may fail to reject any of the pairwise null hypotheses, even though the Kruskal-Wallis test itself was rejected at a given $\alpha$ and the family-wise error rate or false discovery rate is controlled at that same level. This can happen in any omnibus/post hoc multiple comparison scenario.
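As a small illustration of how adjustment can remove every pairwise rejection, the sketch below applies two common family-wise error rate adjustments (Bonferroni and Holm) to three pairwise p-values. The raw p-values are hypothetical numbers chosen for the example, not results from the question's data, and the function names are just labels for this sketch.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def holm(p_values):
    """Holm step-down adjustment; returns adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda k: p_values[k])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, k in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[k])
        running_max = max(running_max, candidate)  # keep adjusted values monotone
        adjusted[k] = running_max
    return adjusted


# Hypothetical unadjusted p-values for the three pairwise comparisons.
raw = [0.03, 0.04, 0.20]
print("Bonferroni:", [round(p, 3) for p in bonferroni(raw)])
print("Holm:      ", [round(p, 3) for p in holm(raw)])
```

With these made-up inputs, two comparisons fall below 0.05 before adjustment and none do afterwards.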

Unless you have reason to believe that one group's survival time is longer or shorter than another's a priori, you should be using the two-sided tests.
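The pattern described in the question (significant one-tailed, not significant two-tailed) is what one would expect when the statistic is only moderately extreme, because for a symmetric reference distribution the two-sided p-value is twice the smaller one-sided p-value. A minimal sketch using a standard normal z statistic follows; the value z = 1.8 is an arbitrary illustrative number, not one computed from the question's data.

```python
import math


def normal_upper_tail(z):
    """P(Z >= z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def one_and_two_sided(z):
    """Return (lower-tail p, upper-tail p, two-sided p) for a z statistic."""
    upper = normal_upper_tail(z)
    lower = normal_upper_tail(-z)
    two_sided = min(1.0, 2.0 * min(lower, upper))
    return lower, upper, two_sided


# Arbitrary illustrative statistic.
lower, upper, two_sided = one_and_two_sided(1.8)
print(f"upper-tail p = {upper:.4f}, two-sided p = {two_sided:.4f}")
```

Choosing the tail after looking at the data effectively halves the p-value without justification, which is why the direction has to be fixed a priori.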

Dunn's test can be performed in Stata using dunntest (type net describe dunntest, from(https://www.alexisdinno.com/stata)), and in R using the dunn.test package.

Also, I wonder if you might take a survival analysis approach to assessing whether and when an animal dies based on different conditions?

^* A few less well-known post hoc pair-wise tests to follow a rejected Kruskal-Wallis include Conover-Iman (like Dunn, but based on the t distribution rather than the z distribution; implemented for Stata in the conovertest package, and for R in the conover.test package), and the Dwass-Steel-Critchlow-Fligner tests.

Thanks for your answer. Is the Dunn test also known as the Nemenyi-Damico-Wolfe-Dunn test or is that a separate test? — Phlox Midas, Aug 15 '14 at 12:26
I ask because I can't find any implementation of the Dunn test. — Phlox Midas, Aug 15 '14 at 13:02
@PhloxMidas I don't know about the "Nemenyi-Damico-Wolfe-Dunn test," but Wikipedia implies it is an appropriate *post hoc* test following rejection of an omnibus test in a repeated measures design—e.g. following a Friedman test. Also, see my comment about Stata. — Alexis, Aug 15 '14 at 18:04

score 7 · Answer 2 · edited Apr 27 '15 at 23:28

A unifying generalization of Kruskal-Wallis/Wilcoxon is the proportional odds model, which admits general contrasts with either pointwise or simultaneous confidence intervals for odds ratios. This is implemented in my R rms package's orm and contrast.rms functions.

answered Aug 15 '14 at 16:31 by Frank Harrell; edited Apr 27 '15 at 23:28 by Glen_b

score 1 · Answer 3 · edited May 03 '15 at 22:59

You can also use the critical difference after Conover or the critical difference after Schaich and Hamerle. The former is more liberal, whereas the latter is exact but lacks a bit of power. Both methods are illustrated on my website, brightstat.com, and brightstat's web app also lets you calculate these critical differences and perform the post hoc tests right away (see "Kruskal-Wallis on brightstat.com").

score -1 · Answer 4 · answered Apr 21 '15 at 04:54

If you are using SPSS, do the post hoc Mann-Whitney tests with a Bonferroni correction (significance level divided by the number of pairwise comparisons).

answered Apr 21 '15 at 04:54 by Caramba

The Mann-Whitney suffers from the two problems I identify in my answer, and is an inappropriate *post hoc* test for Kruskal-Wallis. – Alexis Sep 28 '15 at 17:58
