The Mann-Whitney test requires homogeneity of variance if a significant result is supposed to be interpreted as a difference in medians.

If homogeneity of variance does not hold but the test is significant, which aspects of the test can I report?

Ferdi
  • See also http://stats.stackexchange.com/q/113334/3277 - a very similar question with interesting answers and comments. – ttnphns Aug 27 '14 at 13:40

1 Answer

You can interpret the $U$ (rank sum) test as a test for stochastic dominance. In that case, the null hypothesis is not H$_{0}\text{: }\tilde{\mu}_{A} = \tilde{\mu}_{B}$ (i.e. equal medians), but H$_{0}\text{: P}\left(X_{i} > X_{j}\right)=0.5$ for all $i \ne j$, with $i,j \in \{1,\dots,k\}$ for $k$ groups (i.e. there is stochastic equality among all groups), and the alternative is H$_{\text{A}}\text{: P}\left(X_{i}>X_{j}\right) > 0.5$ for at least one $i \ne j$. Reading a rejection as stochastic dominance assumes (per Scortchi's comment) that the CDFs do not cross.

Failing to reject the null in such a case means you found no evidence of stochastic dominance. Rejecting the null in such a case means you did.
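
As a rough illustration of the quantity in these hypotheses, the sketch below estimates $\text{P}(X_{i} > X_{j})$ for two groups by counting pairs, which is the same as $U/(mn)$ for group sizes $m$ and $n$. The function name `prob_superiority` and the data are made up for illustration, and counting ties as one half is an assumption (a common convention), not something stated above.

```python
def prob_superiority(x, y):
    """Return (U, U / (m * n)) for two samples x and y.

    U counts the pairs with x_i > y_j; tied pairs count as one half
    (an assumed convention), so U / (m * n) estimates
    P(X > Y) + 0.5 * P(X = Y).
    """
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u, u / (len(x) * len(y))


# Hypothetical data, purely for illustration.
group_a = [1.1, 2.3, 2.9, 3.8, 4.0]
group_b = [2.0, 3.1, 4.5, 5.2, 6.7, 7.1]

u_stat, p_hat = prob_superiority(group_a, group_b)
print(u_stat, round(p_hat, 3))  # prints: 6.0 0.2
```

An estimate near 0.5 is what the null hypothesis predicts; here only 6 of the 30 pairs have the value from `group_a` larger, so the estimate is 0.2.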

Alexis
  • It is important to not compute the $P$-value in a way that assumes the two distributions are identical under the null hypothesis, in order to obtain a powerful test for stochastic dominance. One way to do this is to use the general $U$-statistic standard error as computed in the R `Hmisc` package's `rcorr.cens` function. – Frank Harrell Aug 07 '14 at 15:54
  • Thank you both very much for your answers, I really appreciate your knowledge! @Alexis: What would I report if I found evidence for stochastic dominance (e.g. is it valid to report the U-statistic)? @Frank: Unfortunately, I am using SPSS and am unfamiliar with R. Could you provide the formula that is used to calculate this "standard error"? What exactly is it? –  Aug 07 '14 at 16:19
  • NB Stochastic dominance of group $i$ over group $j$ means $F_{X_i}(x) \le F_{X_j}(x)$ for all $x$, which does not follow from $\text{P}(X_i > X_j) > 0.5$ without the additional assumption that the cdfs don't cross. See [non-transitive dice](http://en.wikipedia.org/wiki/Nontransitive_dice). – Scortchi - Reinstate Monica Aug 07 '14 at 16:49
  • The formula is simple but requires a double loop not readily implemented in SPSS. – Frank Harrell Aug 07 '14 at 18:00
  • Thank you, all. This helps a lot! Unfortunately, I am still puzzled about what I can report. The most important open part of the question: can I report a U-statistic and a p-value and say that, e.g., values from group Y are statistically significantly higher than those from group X? P.S. @Frank: can the values be calculated manually? If so, I would highly appreciate it if you could post the general formula. –  Aug 07 '14 at 18:38
  • @user3669454 Following my answer above, you can report: If *not rejecting* H$_{0}$, "found no evidence that any group stochastically dominates any other group (under the assumption that the population CDFs of each group do not cross)"; or, if *rejecting* H$_{0}$, "found evidence that at least one group stochastically dominates at least one other group (under the assumption that the population CDFs of each group do not cross)." You **cannot** report difference in group means, or difference in group medians without additional (and much more stringent) assumptions. – Alexis Aug 07 '14 at 19:21
  • @Alexis, thank you very much for this explanation. Only two questions remain: 1) Does the assumption regarding the CDFs amount to saying that no ties can be present? 2) Is there any quantitative statement I can make if H0 is rejected, e.g. as suggested by this book source: deviation from H0 = U / (m*n) (on p. 268 of http://books.google.de/books?id=dPhtioXwI9cC&pg=PA265&dq=Median+Test&hl=de&sa=X&ei=L-fjU-m9F4aG4gStvIDAAg&ved=0CFkQ6AEwBw#v=onepage&q=whitney&f=false) –  Aug 08 '14 at 08:00
  • @user3669454 (1) I don't *think* so... since that assumption is about population CDFs, but we are getting out of my comfort range there; (2) If you look at the null and alternative hypotheses I provided, you will see that they are stated in quantitative terms (i.e. *probabilities* of events). If you are looking for some way of saying "mean" or "median", then not without stringent additional assumptions, no. – Alexis Aug 08 '14 at 15:33
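
For readers asking how the "double loop" mentioned in the comments might look when done by hand, here is a minimal sketch. It is not the `rcorr.cens` code and may not match it exactly. It uses a placement-based (DeLong-style) variance for $U/(mn)$, which is one way to get a standard error without assuming the two distributions are identical. All function names and the data are hypothetical, and the normal approximation for the p-value is an assumption that is rough for samples this small.

```python
import math


def psi(a, b):
    """Pairwise score: 1 if a > b, 0.5 if tied, 0 otherwise."""
    if a > b:
        return 1.0
    if a == b:
        return 0.5
    return 0.0


def sample_variance(values):
    """Unbiased sample variance (divides by len(values) - 1)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


def superiority_with_se(x, y):
    """Estimate theta = P(X > Y) + 0.5 * P(X = Y) and a standard error
    that does not assume identical distributions. Needs len(x), len(y) >= 2.
    """
    m, n = len(x), len(y)
    # Double loop: score every observation against the whole other sample.
    place_x = [sum(psi(xi, yj) for yj in y) / n for xi in x]
    place_y = [sum(psi(xi, yj) for xi in x) / m for yj in y]
    theta = sum(place_x) / m  # identical to U / (m * n)
    se = math.sqrt(sample_variance(place_x) / m + sample_variance(place_y) / n)
    return theta, se


# Hypothetical data, purely for illustration.
group_a = [1.1, 2.3, 2.9, 3.8, 4.0]
group_b = [2.0, 3.1, 4.5, 5.2, 6.7, 7.1]

theta, se = superiority_with_se(group_a, group_b)
z = (theta - 0.5) / se  # undefined if se == 0 (e.g. complete separation)
p_two_sided = math.erfc(abs(z) / math.sqrt(2))  # normal approximation
print(round(theta, 3), round(se, 3), round(z, 2), round(p_two_sided, 3))
```

Reporting `theta` together with its standard error (or an interval built from it) gives a quantitative statement in terms of the probability in the hypotheses, rather than in terms of means or medians.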