I have two vectors of unequal sizes. They contain quantitative data that are not normally distributed. I would like to test whether they are statistically different. Is there a good function in R for this?

I am thinking I should not use a t-test (as that assumes normality) or the Wilcoxon test (as that uses paired groups of equal sizes).

Haitao Du
user12211991
  • Can you say more about the variables you're observing? Are they discrete, for example? Bounded? How are they obtained? Also, there are many ways that distributions can differ; what kinds of difference are you interested in? – Glen_b Mar 04 '15 at 06:35
  • Also, what are the sample sizes? – Glen_b Mar 04 '15 at 06:43

1 Answer

There are two tests named after Wilcoxon.

The one you mention is the signed rank test, but there's also the rank sum test, which might well be suitable for your problem. It's sometimes called the Mann-Whitney test.

It can be used to test for a location shift, or a scale shift, or more general alternatives, including stochastic dominance. For more general differences still, you might consider a two-sample Kolmogorov-Smirnov test.

You might also consider a permutation/randomization test (particularly if you're interested in a difference in means).
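A permutation test for a difference in means could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `perm_test_means`, the number of permutations, and the simulated samples `x` and `y` are all hypothetical and not from the question.

```r
# Two-sided permutation test for a difference in means.
# perm_test_means is a hypothetical helper name, not a base R function.
perm_test_means <- function(x, y, n_perm = 9999) {
  observed <- mean(x) - mean(y)
  pooled <- c(x, y)
  n_x <- length(x)

  # Reassign group labels at random and recompute the statistic each time.
  perm_diffs <- replicate(n_perm, {
    idx <- sample(length(pooled), n_x)
    mean(pooled[idx]) - mean(pooled[-idx])
  })

  # Count the observed statistic as one of the permutations (the +1 terms),
  # so the p-value can never be exactly zero.
  p_value <- (sum(abs(perm_diffs) >= abs(observed)) + 1) / (n_perm + 1)
  list(observed = observed, p.value = p_value)
}

set.seed(1)                  # arbitrary seed, for reproducibility only
x <- rexp(40, rate = 1)      # made-up skewed sample of size 40
y <- rexp(65, rate = 0.6)    # made-up skewed sample of size 65

res <- perm_test_means(x, y)
res$observed
res$p.value
```

The samples do not need to be the same length, since only the group labels are shuffled. A different statistic (for example a difference in medians) could be swapped in for `mean` if that is the difference of interest.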

(In R, both Wilcoxon tests can be obtained via `wilcox.test`, and the two Kolmogorov-Smirnov tests can be obtained via `ks.test`. Randomization tests are also very easy to carry out.)
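As a minimal sketch of those two calls on samples of unequal size, the snippet below uses simulated data; the vectors `x` and `y` and their distributions are made up for illustration and stand in for the two vectors from the question.

```r
set.seed(1)                  # arbitrary seed, for reproducibility only
x <- rexp(40, rate = 1)      # made-up skewed sample of size 40
y <- rexp(65, rate = 0.6)    # made-up skewed sample of size 65

# Wilcoxon rank sum (Mann-Whitney) test: paired = FALSE is the default,
# so the two samples may have different lengths.
rank_sum <- wilcox.test(x, y)

# Two-sample Kolmogorov-Smirnov test: passing a second numeric vector
# (rather than a distribution name) requests the two-sample version.
ks <- ks.test(x, y)

rank_sum$p.value
ks$p.value
```

Calling `wilcox.test(x, y, paired = TRUE)` would instead give the signed rank test, which does require paired samples of equal length.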

Glen_b
  • I think you still have the problem described here: [Are large data sets inappropriate for hypothesis testing?](http://stats.stackexchange.com/questions/2516/are-large-data-sets-inappropriate-for-hypothesis-testing) – Haitao Du Apr 05 '17 at 17:55