
I have a dataset with about 6 groups, and each group has at least n = 150-200 samples. When I look at the data, it's not normally distributed, and the variances are not equal (e.g., the smallest standard deviation is 20 and the largest is 60 or similar). The data is a function of counts, e.g., how many times a player jumped during a game.

Now, if there were two groups I'd do a two-sample KS test. However, since there are 6 groups, I am wondering whether the lack of homogeneous variances will affect my test result if I perform a Kruskal-Wallis ANOVA.

In other words, is homogeneity of variances a strict requirement that must be fulfilled for a Kruskal-Wallis one-way ANOVA?

Or am I wrong in using this test, and there is something else which I have not thought of?

EDIT: I am working with SPSS

EDIT2: Saw this and tried doing it with GLM, but in the end when looking at the model, it ends up being an ANOVA

Rover Eye
  • When you did a GLM ... how exactly did you do it? If it came out the same as an ANOVA (same estimates and p-values etc) you're not fitting the right GLM model. You need a GLM for counts (such as a Poisson model or a negative binomial for example). – Glen_b Aug 17 '15 at 19:18
  • @Glen_b I did it in SPSS, wherein I put in all variables I had the data for in the model, searched for any interactions, and removed those without any. Would you like to chat about this? http://chat.stackexchange.com/rooms/27067/discussion-between-re-and-glen – Rover Eye Aug 17 '15 at 19:33
  • @Glen_b When looking more closely at the ANOVA vs. the GLM, although the significance values are the same, the F statistic in the GLM (about 10) is half that of the ANOVA (about 20). Is this what you meant? – Rover Eye Aug 17 '15 at 19:44
  • sorry can't chat now have to sleep – Glen_b Aug 17 '15 at 20:02
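
For readers following the suggestion in the comments to use a model for counts, here is a minimal sketch of a one-way Poisson model with a quasi-Poisson correction for overdispersion. It is not SPSS output and not necessarily the model the commenter had in mind. The function name `count_group_test` and the input layout (one array of counts per group) are hypothetical. For a single grouping factor the Poisson fitted values are just the group means, so no iterative fitting is needed. The sketch assumes every group mean is greater than zero.

```python
import numpy as np
from scipy import stats


def count_group_test(samples):
    """Test for a group effect in a one-way Poisson (log-link) model.

    `samples` is a list of 1-D arrays of counts, one per group (hypothetical
    layout). Returns the likelihood-ratio statistic, its Poisson p-value, the
    Pearson dispersion estimate, and a quasi-Poisson F-test p-value.
    """
    samples = [np.asarray(s, dtype=float) for s in samples]
    k = len(samples)
    n_total = sum(s.size for s in samples)
    grand_mean = np.concatenate(samples).mean()
    group_means = [s.mean() for s in samples]

    # Drop in deviance from the intercept-only model to the group model:
    # 2 * sum_g (sum of counts in g) * log(group mean / grand mean).
    lr = 2.0 * sum(
        s.sum() * np.log(m / grand_mean) for s, m in zip(samples, group_means)
    )

    # Pearson chi-square divided by residual df estimates the dispersion.
    # A value well above 1 means plain Poisson p-values are too optimistic.
    pearson = sum(((s - m) ** 2 / m).sum() for s, m in zip(samples, group_means))
    dispersion = pearson / (n_total - k)

    p_poisson = stats.chi2.sf(lr, k - 1)
    f_stat = (lr / (k - 1)) / dispersion
    p_quasi = stats.f.sf(f_stat, k - 1, n_total - k)
    return {
        "lr": lr,
        "p_poisson": p_poisson,
        "dispersion": dispersion,
        "f_stat": f_stat,
        "p_quasi": p_quasi,
    }
```

With standard deviations as large as those described in the question, the dispersion estimate is likely to be far above 1, in which case the quasi-Poisson F-test (or a negative binomial model, not shown here) is the more defensible of the two p-values.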

1 Answer


For K-W to be optimal, the proportional odds assumption must be satisfied. This is often a weaker assumption than constant variance. To check the assumption, compute the empirical distribution function for each of the 6 groups, take the $\log\frac{p}{1-p}$ transformation of it, and plot this on the $y$-axis vs. the original values on the $x$-axis; check for parallelism.
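
The check described above could be sketched in Python roughly as follows. This is an illustration only, not the answerer's own code: the helper names `logit_ecdf` and `plot_logit_ecdfs` and the dictionary-of-arrays input are hypothetical. The largest observed value in each group is dropped because the empirical distribution function equals 1 there and the logit is undefined.

```python
import numpy as np


def logit_ecdf(values):
    """Return the distinct values and logit(ECDF) evaluated at each of them.

    The largest value is dropped because the ECDF is 1 there and
    log(p / (1 - p)) is undefined.
    """
    x = np.asarray(values, dtype=float)
    xs, counts = np.unique(x, return_counts=True)
    p = np.cumsum(counts) / x.size  # ECDF at each distinct value
    keep = p < 1.0
    return xs[keep], np.log(p[keep] / (1.0 - p[keep]))


def plot_logit_ecdfs(groups):
    """Draw one logit-ECDF curve per group; `groups` maps name -> counts."""
    import matplotlib.pyplot as plt  # imported here so logit_ecdf works without it

    fig, ax = plt.subplots()
    for name, values in groups.items():
        xs, y = logit_ecdf(values)
        ax.step(xs, y, where="post", label=str(name))
    ax.set_xlabel("original value")
    ax.set_ylabel("logit of empirical CDF")
    ax.legend()
    return ax
```

Calling `plot_logit_ecdfs({"A": counts_a, "B": counts_b, ...})` with one array per group gives the curves to inspect. Roughly parallel curves (a constant vertical offset between groups) are consistent with proportional odds; curves that cross or fan out suggest the assumption is violated.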

K-W can be valid (though without optimal power) if the prop. odds assumption is violated, if you are careful in how $P$-values are computed.
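
The answer does not say which way of computing $P$-values it has in mind. One possible reading, shown here purely as an assumption, is to replace the chi-square approximation with a permutation $P$-value. The sketch below uses `scipy.stats.kruskal` for the statistic; the wrapper `kw_permutation_pvalue` and its parameters are hypothetical. Note that the permutation null is that all groups share the same distribution, so a small $P$-value indicates some difference between groups, not necessarily a shift in location.

```python
import numpy as np
from scipy.stats import kruskal


def kw_permutation_pvalue(samples, n_perm=2000, seed=0):
    """Kruskal-Wallis statistic with a Monte Carlo permutation p-value.

    `samples` is a list of 1-D arrays, one per group. Group labels are
    shuffled `n_perm` times while keeping the group sizes fixed.
    """
    rng = np.random.default_rng(seed)
    samples = [np.asarray(s, dtype=float) for s in samples]
    observed = kruskal(*samples).statistic

    pooled = np.concatenate(samples)
    # Indices at which to cut the shuffled pool back into groups.
    split_points = np.cumsum([s.size for s in samples])[:-1]

    exceed = 0
    for _ in range(n_perm):
        shuffled = rng.permutation(pooled)
        stat = kruskal(*np.split(shuffled, split_points)).statistic
        if stat >= observed:
            exceed += 1

    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (exceed + 1) / (n_perm + 1)
```

For six groups of 150-200 observations, a few thousand permutations should run quickly; the smallest attainable $P$-value is `1 / (n_perm + 1)`.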

Frank Harrell
  • Thanks. Any idea if this can be done in SPSS? – Rover Eye Aug 17 '15 at 17:30
  • I've never used SPSS. It is easy to do in R. – Frank Harrell Aug 17 '15 at 17:43
  • @FrankHarrell I was pointed to this answer, which I thought could be useful for my problem as well. Unfortunately, I am not sure, and I think you could help me. Would you mind checking my question? http://stats.stackexchange.com/questions/167463/adjust-means-for-confounding-factors-matlab – gabboshow Aug 18 '15 at 13:29