
I keep reading mixed things about doing t-tests on non-normal data when the sample sizes are large and I don't know what to believe.

I have large sample sizes (over 1000), but the distribution is far from normal. It is heavily right-skewed, and there are a lot of zeros in the data because it measures time spent on an activity (some people didn't engage in the activity that day).

Is this the right situation to use the Wilcoxon rank-sum test, or will the t-test hold up?

  • See https://stats.stackexchange.com/questions/69898/t-test-on-highly-skewed-data/69967#69967 for a discussion and some relevant techniques. – whuber Nov 10 '17 at 01:02
  • @catherine It depends on the properties of what you have; it's *probably* large enough that the actual significance levels will be close to what you choose them to be, but the remaining issue is the impact on relative power. If potential effect sizes of interest are small enough that you still have some reasonable risk of non-rejection, power will matter, and then you might have a problem compared to some other options with better relative power ... ctd – Glen_b Nov 10 '17 at 02:00
  • ctd... (the rank sum test is not the only possibility, and what you might choose to do may also depend on the precise hypothesis you're trying to test as well as the properties of the distribution you may be dealing with) – Glen_b Nov 10 '17 at 02:00
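
One way to get a feel for the two issues raised in the comments (actual significance level and relative power) is a small simulation. The sketch below is only an illustration under assumed settings, not an analysis of the asker's data: the zero-inflated lognormal model, the 40% share of zeros, the lognormal parameters, and the helper names `simulate_times`, `welch_z`, `rank_sum_z` and `rejection_rates` are all hypothetical. Both p-values use a large-sample normal approximation, and the rank-sum statistic uses midranks with a tie correction because of the many zeros.

```python
import math
import random
from statistics import NormalDist, mean, variance


def simulate_times(n, p_zero, mu, sigma, rng):
    """Assumed model: zero with probability p_zero, otherwise a lognormal time."""
    return [0.0 if rng.random() < p_zero else rng.lognormvariate(mu, sigma)
            for _ in range(n)]


def welch_z(x, y):
    """Welch statistic with a two-sided normal-approximation p-value."""
    se = math.sqrt(variance(x) / len(x) + variance(y) / len(y))
    z = (mean(x) - mean(y)) / se
    return z, 2 * (1 - NormalDist().cdf(abs(z)))


def rank_sum_z(x, y):
    """Wilcoxon rank-sum test: normal approximation, midranks, tie correction."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        t = j - i                    # size of this group of tied values
        midrank = (i + 1 + j) / 2    # average of ranks i+1 .. j
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        tie_term += t ** 3 - t
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u - n1 * n2 / 2) / math.sqrt(var_u)
    return z, 2 * (1 - NormalDist().cdf(abs(z)))


def rejection_rates(shift, reps=200, n=1000, alpha=0.05, seed=1):
    """Share of simulated datasets in which each test rejects at level alpha."""
    rng = random.Random(seed)
    hits_t = hits_w = 0
    for _ in range(reps):
        x = simulate_times(n, 0.4, 3.0, 1.0, rng)
        # Assumption: the groups differ only in the lognormal location.
        y = simulate_times(n, 0.4, 3.0 + shift, 1.0, rng)
        hits_t += welch_z(x, y)[1] < alpha
        hits_w += rank_sum_z(x, y)[1] < alpha
    return hits_t / reps, hits_w / reps


print("no difference (t, rank-sum):", rejection_rates(0.0))
print("small shift   (t, rank-sum):", rejection_rates(0.15))
```

With no difference between groups, the two rates estimate the actual significance level of each test; with a shift, they estimate power. Note that the two tests do not target the same hypothesis (a difference in means versus a tendency for one group to have larger values), so which comparison matters depends on the question being asked, and the picture may change if the groups differ in the share of zeros rather than in the nonzero times.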

0 Answers