Chi-squared testing for say TWO samples from TWO distributions

Question

A chi-squared test can be used to check the hypothesis that a given sample comes from a given theoretical distribution.

Suppose I have two samples (x1, ..., xn) and (y1, ..., ym), and two theoretical distributions F1 and F2. I want to check the hypothesis that X is from F1 and Y is from F2.

Question: Is there a modification of the chi-squared test for this situation of two samples and two distributions?

Of course, we can do two tests, but the question is how to combine the results. For example, the first test gives "YES" and the second "NO". That is why I want something like a joint test.

score 1 · Accepted Answer · answered Jun 25 '18 at 15:04

If the two sets of values are independent, for a joint test like that, you could easily just combine the two chi-squared statistics -- add the statistics, add their d.f., and the result is again distributed as chi-squared under the null hypothesis. However, I'd generally avoid the use of chi-squared tests for testing distributional fits.
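Written out, the additivity looks as follows. This is a sketch under stated assumptions: each statistic is a Pearson chi-squared statistic computed on binned data, with k_1 bins for the first sample and k_2 bins for the second, and no parameters estimated from the data. The symbols k_1, k_2, O and E (observed and expected bin counts) are notation introduced here, not taken from the question.

```latex
X_1^2 = \sum_{i=1}^{k_1} \frac{(O_{1i} - E_{1i})^2}{E_{1i}} \;\sim\; \chi^2_{k_1 - 1},
\qquad
X_2^2 = \sum_{j=1}^{k_2} \frac{(O_{2j} - E_{2j})^2}{E_{2j}} \;\sim\; \chi^2_{k_2 - 1}
\quad \text{(approximately, under } H_0\text{)}

X_1^2 \text{ and } X_2^2 \text{ independent}
\;\Longrightarrow\;
X_1^2 + X_2^2 \;\sim\; \chi^2_{(k_1 - 1) + (k_2 - 1)} = \chi^2_{k_1 + k_2 - 2}
```

Under these assumptions the degrees of freedom depend on the numbers of bins, not on the sample sizes n and m.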

More generally there are any number of other ways you could combine independent tests. The most obvious (and most common) way is Fisher's method, which would add minus twice the logs of the p-values, yielding ... another chi-squared test.
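A minimal sketch of Fisher's method in Python, using only the standard library. The function names `chi2_sf_even_df` and `fisher_combine` are made up for this illustration. It relies on the fact that the combined statistic always has an even number of degrees of freedom (2 per test), for which the chi-squared upper tail has a closed form.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper tail P(X > x) of a chi-squared variable with even df.

    For df = 2k the tail is exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def fisher_combine(p_values):
    """Combine independent p-values with Fisher's method.

    Returns (statistic, df, combined p-value), where
    statistic = -2 * sum(log p_i) and df = 2 * (number of tests).
    """
    if not p_values or any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    df = 2 * len(p_values)
    return statistic, df, chi2_sf_even_df(statistic, df)


# Hypothetical p-values from two independent goodness-of-fit tests.
stat, df, p_joint = fisher_combine([0.04, 0.30])
print(f"statistic = {stat:.3f}, df = {df}, joint p = {p_joint:.4f}")
```

With two tests the combined p-value reduces to q(1 - ln q), where q is the product of the two individual p-values. A single joint p-value like this avoids the "one YES, one NO" ambiguity raised in the question.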

answered Jun 25 '18 at 15:04

Glen_b


Thank you! Is there a reference for adding the statistics? By adding d.f., do you mean we should take n+m-1 or n+m-2? – Alexander Chervov Jun 25 '18 at 19:03
What would you suggest using as a test for distributional fit? – Alexander Chervov Jun 25 '18 at 19:05
1. The fact that the sum of two independent chi-squared variables is itself chi-squared with the sum of the dfs would likely date back at least to Helmert (so ... nearly 150 years ago I'd guess -- and so I don't have a reference, outside any standard text for a first course in mathematical statistics). 2. I can't pick which of those two df values (or some other) because you don't state what your d.f.s are for each test individually (it depends on whether there was any parameter estimation and then on how that estimation was done, and on how your statistic was computed. for example ... ctd – Glen_b Jun 26 '18 at 00:06
ctd. Are you binning a continuous distribution?) 3. You also don't give nearly enough information to recommend a particular goodness of fit test. I don't even know if we're dealing with continuous variates. – Glen_b Jun 26 '18 at 00:06
Sorry for the missing details. 1. Both distributions are continuous. 2. No parameter estimation. – Alexander Chervov Jun 26 '18 at 04:49
Well, in the absence of information about what the distributions are, and the kinds of deviations that are most important, I'd probably look for something with good omnibus power -- I guess I'd lean toward two Anderson-Darling tests combined using the Fisher method, or perhaps two Kolmogorov-Smirnovs (combined the same way) if I was worried more about the center than the tails. It depends on what you need power against. – Glen_b Jun 26 '18 at 07:47
Thank you again! Would you be so kind as to take a look at: https://stats.stackexchange.com/questions/353218/choosing-bins-for-chi-square-testing-distributional-fits-for-distribution-simila – Alexander Chervov Jun 26 '18 at 09:13
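The suggestion in the comments of running two Kolmogorov-Smirnov tests and combining them with Fisher's method can be sketched as below. This is an illustration only: it uses Kolmogorov-Smirnov rather than Anderson-Darling because it is simpler to write out, the p-value comes from the asymptotic Kolmogorov series (a reasonable approximation only for moderately large samples), and the example distributions (standard normal for F1, unit exponential for F2), the simulated samples, and all function names are assumptions made for the example.

```python
import math
import random
from statistics import NormalDist


def ks_statistic(sample, cdf):
    """One-sample KS statistic: sup |empirical CDF - cdf| over the sample."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # Compare cdf with the empirical CDF just after and just before x.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def ks_pvalue_asymptotic(d, n, max_terms=100):
    """Asymptotic p-value: 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 k^2 lam^2)."""
    lam = math.sqrt(n) * d
    if lam < 0.2:
        return 1.0  # the tail probability is essentially 1 here
    total = 0.0
    for k in range(1, max_terms + 1):
        term = math.exp(-2.0 * k * k * lam * lam)
        total += term if k % 2 == 1 else -term
        if term < 1e-16:
            break
    return min(1.0, max(0.0, 2.0 * total))


def fisher_two(p1, p2):
    """Fisher's method for two independent p-values (chi-squared, 4 df)."""
    tiny = 1e-300  # guard against log(0) from underflow
    s = -2.0 * (math.log(max(p1, tiny)) + math.log(max(p2, tiny)))
    return math.exp(-s / 2.0) * (1.0 + s / 2.0)


# Simulated data standing in for the two samples (assumed for illustration).
random.seed(0)
x = [random.gauss(0.0, 1.0) for _ in range(200)]
y = [random.expovariate(1.0) for _ in range(150)]

F1 = NormalDist(0.0, 1.0).cdf                            # hypothesised F1
F2 = lambda t: 1.0 - math.exp(-t) if t > 0 else 0.0      # hypothesised F2

p_x = ks_pvalue_asymptotic(ks_statistic(x, F1), len(x))
p_y = ks_pvalue_asymptotic(ks_statistic(y, F2), len(y))
p_joint = fisher_two(p_x, p_y)
print(f"p_x = {p_x:.3f}, p_y = {p_y:.3f}, joint p = {p_joint:.3f}")
```

In practice a statistics library's exact or better-corrected p-values would be preferable to the asymptotic series used here, especially for small n or m.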
