I have a discrete ratio variable (length) from two samples and I'd like to know whether the two distributions are the same or different. In my case length is discrete because it can only take on integer values (it represents the length of a sequence, e.g., the number of words in a sentence). This post seems to suggest that the KS test shouldn't be used for discrete distributions, but this webpage seems to suggest that the KS test can be used with ordinal data. So I'm confused: can I use the KS test or not? If not, should I use the chi-square two-sample test?
-
I lean toward chi-square tests since they are useful for model fitting in many different settings. When I think of count data my mind goes to Poisson or negative binomial regression. Using the empirical sandwich covariance estimator makes the inference robust to misspecification of the distributional form. – Geoffrey Johnson Dec 06 '21 at 02:26
-
If you were to use the KS test without accounting for the pattern of ties, it would be highly conservative (which will reduce its power). You could use a permutation test to at least get close to the desired significance level, which should help. The chi-squared test is not sensitive to "smooth" alternatives (e.g. a smooth change in location, variance or skewness, such as comparing two similarish negative binomials), so it will tend to have low power against the kind of alternatives you're most likely to care about. – Glen_b Dec 06 '21 at 05:01
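For anyone who wants to try the permutation approach mentioned in the comment above, here is a minimal sketch using only the Python standard library. The function names (`ks_statistic`, `permutation_ks_test`), the number of permutations, the seed, and the sample data are all illustrative assumptions, not something taken from the thread. The idea is to compute the usual two-sample KS statistic, then build its null distribution by repeatedly shuffling the pooled observations, so the tie pattern of the data is reflected in the p-value.

```python
import random
from bisect import bisect_right


def ks_statistic(x, y):
    """Two-sample KS statistic: max |F_x(t) - F_y(t)| over observed values t."""
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    # Evaluate both empirical CDFs at each distinct observed value,
    # so a group of tied observations counts as a single jump.
    return max(
        abs(bisect_right(xs, t) / n - bisect_right(ys, t) / m)
        for t in set(xs) | set(ys)
    )


def permutation_ks_test(x, y, n_perm=2000, seed=0):
    """Monte Carlo permutation p-value for the two-sample KS statistic.

    n_perm and seed are arbitrary illustrative defaults.
    """
    rng = random.Random(seed)
    observed = ks_statistic(x, y)
    pooled = list(x) + list(y)
    n = len(x)
    at_least_as_extreme = 0
    for _ in range(n_perm):
        # Under the null hypothesis the group labels are exchangeable,
        # so reshuffle the pooled data and split it into two groups again.
        rng.shuffle(pooled)
        stat = ks_statistic(pooled[:n], pooled[n:])
        if stat >= observed - 1e-12:  # small tolerance for float rounding
            at_least_as_extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (at_least_as_extreme + 1) / (n_perm + 1)
    return observed, p_value


if __name__ == "__main__":
    # Made-up sentence lengths, for illustration only.
    sample_a = [5, 7, 7, 8, 9, 9, 10, 12, 12, 15]
    sample_b = [8, 9, 11, 12, 12, 13, 14, 14, 16, 18]
    d, p = permutation_ks_test(sample_a, sample_b)
    print(f"KS statistic = {d:.3f}, permutation p-value = {p:.4f}")
```

Because the reference distribution is built from the observed data, heavy ties in integer-valued lengths are handled automatically, while the statistic still responds to the ordered, "smooth" shifts that the chi-squared test tends to miss.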