I have a data set (21 binned values) which I have fitted to a Gaussian with IDL ($\mu=3.825$, $\sigma=0.0377$). I have tried to find the $\chi^2$, getting $21.14$ for $20$ d.f. Here my understanding is a little shaky; I think this means I can be 60% confident that the distribution is normal (please correct me if I'm wrong; every information source seems to say something different on this point!).
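(For reference, a minimal sketch, in Python/SciPy rather than the IDL actually used, of how a $\chi^2$ statistic and its degrees of freedom translate into probabilities; the numbers and the choice of SciPy are purely illustrative.)

```python
# Minimal sketch: converting a chi-square statistic and df into probabilities.
from scipy.stats import chi2

stat = 21.14   # chi-square statistic quoted above
df = 20        # degrees of freedom quoted above (see the answer regarding the correct df)

p_value = chi2.sf(stat, df)   # P(chi2_df > stat): small values are evidence against normality
cdf_val = chi2.cdf(stat, df)  # P(chi2_df <= stat): NOT a "confidence that the data are normal"

print(p_value, cdf_val)
```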

I went on to try to use the K-S test, with peculiar results: attempting it manually, I found $D=0.09958$ (without ordering the values min->max), and when I ordered the values I got $D=0.12$. When I gave it to IDL to calculate, it returned $D=0.143$, with a 0.97 probability that the data fit the normal. These D-values seem to point to a very high probability, and I suspect I've muddled something. Have I too few data bins? Do I even need to go as far as the K-S test if $\chi^2$ gives me a fair result? I'm supplying the data in case it makes more sense than I do.

| Observed frequency | Expected frequency (Gaussian) |
| --- | --- |
| 0 | 0.090119938 |
| 0 | 0.202768898 |
| 0 | 0.42521436 |
| 1 | 0.831075789 |
| 0 | 1.51390793 |
| 2 | 2.570303594 |
| 2 | 4.067199812 |
| 0 | 5.998362667 |
| 13 | 8.245102466 |
| 9 | 10.5629572 |
| 13 | 12.61249796 |
| 12 | 14.03598269 |
| 12 | 14.5583 |
| 10 | 14.07358146 |
| 14 | 12.68015971 |
| 8 | 10.64807109 |
| 8 | 8.333804074 |
| 4 | 6.079134555 |
| 2 | 4.133009011 |
| 0 | 2.618888894 |
| 0 | 1.54665668 |
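(For concreteness, a minimal sketch, in Python rather than IDL, that checks the totals and computes the Pearson statistic directly from the two columns above; the arrays are simply the table retyped.)

```python
# Minimal sketch: Pearson chi-square statistic from the binned counts above.
observed = [0, 0, 0, 1, 0, 2, 2, 0, 13, 9, 13, 12, 12, 10, 14, 8, 8, 4, 2, 0, 0]
expected = [0.090119938, 0.202768898, 0.42521436, 0.831075789, 1.51390793,
            2.570303594, 4.067199812, 5.998362667, 8.245102466, 10.5629572,
            12.61249796, 14.03598269, 14.5583, 14.07358146, 12.68015971,
            10.64807109, 8.333804074, 6.079134555, 4.133009011, 2.618888894,
            1.54665668]

# The totals disagree (110 observed vs. about 135.8 expected), so the expected
# frequencies would need renormalising before any test is applied.
print(sum(observed), sum(expected))

# Raw Pearson statistic; note that with this many bins having expected counts
# well below 5, its null distribution is not well approximated by a chi-square.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(chi_sq)
```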
  • Where does the 60% come from? – Ken May 08 '12 at 16:02
  • It may have been from a generator of some kind. If I use tables of 'probability under $H_{0}$ that $\chi^2$ exceeds the listed value', then my $\chi^2$ exceeds the tabulated value for probabilities anywhere from 0.99 to 0.5. I can't honestly say I fully understand what this means. – Tan May 08 '12 at 16:11
  • There are many puzzling things here. Perhaps the most substantial is that the data document 110 observations whereas the total "expected frequency" is 135.827. Because these disagree, the expected frequencies appear to have been mis-computed. Regardless, the chi-squared test is inapplicable due to the many zero-frequency bins. The remark about ordering is mysterious because it makes no sense in the context of a KS test. For more about the subtleties of chi-square testing, please see http://stats.stackexchange.com/questions/16921/. The histogram exhibits moderate departures from normality. – whuber May 08 '12 at 17:31

1 Answer

$60\%$ makes no sense. The chi-square test compares the observed frequencies to the expected frequencies under your normal model $N(3.825, 0.0377^2)$. The chi-square distribution of the test statistic is approximate, not exact, under the null hypothesis. Since you have $21$ bins and are estimating $2$ parameters, the correct number of degrees of freedom for the chi-square test is $21-2=19$, not $20$.

What the test does is set up the null hypothesis that your distribution is the specified normal. You take a critical value from the chi-square distribution to control your type I error at a level $\alpha$ (often taken to be $0.05$). If your test statistic exceeds the critical value, you conclude that the distribution is not the specified normal, and your risk of error is controlled at level $\alpha$, because that is approximately the probability that the statistic would exceed the critical value if you had in fact been sampling from the normal distribution you specified. If the statistic does not exceed the critical value, you cannot reject the hypothesis that the data came from that normal distribution. This is different from accepting the distribution, because you have not controlled the type II error: there are many normal and non-normal distributions that could generate samples like yours. This is strictly a case of rejecting or not rejecting a null hypothesis; it does not assign a confidence level to the distribution being normal.

Regarding the K-S test, it compares the sample (empirical) distribution function $F_n(x)$ to the hypothesized normal cumulative distribution function $F(x)$ and finds the maximum discrepancy over all possible values of $x$. To do this you must use all of the individual observations, not just the total in each bin.
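(A minimal sketch of that last point, in Python/SciPy; the raw, unbinned observations are not given in the question, so `raw_values` below is only a hypothetical placeholder sample.)

```python
# Minimal sketch: K-S test of raw observations against a fully specified normal.
import numpy as np
from scipy.stats import kstest, norm

rng = np.random.default_rng(0)
raw_values = rng.normal(loc=3.825, scale=0.0377, size=110)  # placeholder for the real data

# Compare the empirical distribution function of the raw values with the
# hypothesized N(3.825, 0.0377^2) CDF. If mu and sigma were estimated from
# these same data, the standard K-S p-value is too optimistic.
D, p = kstest(raw_values, norm(loc=3.825, scale=0.0377).cdf)
print(D, p)
```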

Michael R. Chernick
  • Note that the correct df to use is likely neither 19 nor 20, because I strongly doubt the parameters were estimated via MLE from the binned counts and I suspect the bin cutpoints may have been based on the data. The $\chi^2$ approximation is so exceptionally poor with so many zero bins it hardly merits being called "approximate"; *wrong* would be more accurate. – whuber May 08 '12 at 18:02
  • @whuber All I know is that the two parameters were estimated, and I think that Tan wasn't aware that 20 is not the right number of degrees of freedom in that case. I hesitated to use the word wrong but in the end chose it so the point would come across. There is not much difference between a chi-square with 19 df and one with 20 df, and as I said, the chi-square distribution is only approximate because of the asymptotic nature of the test. Do you know if the degrees of freedom were a point of controversy between R. A. Fisher and Karl Pearson? – Michael R. Chernick May 08 '12 at 18:19
  • Pearson invented the test and always estimated parameters by the method of moments and argued vehemently against Fisher on maximum likelihood. – Michael R. Chernick May 08 '12 at 18:19
  • That's not the issue, Michael. That controversy about DF was settled mathematically long ago. Have you read [my related post](http://stats.stackexchange.com/questions/16921)? I agree there's little difference between 19 and 20, but it's not clear that the correct DF even lies within this range. I think it's actually *much* smaller in this case, to the point that there's moderate evidence that this distribution is not Gaussian. – whuber May 08 '12 at 18:51
  • Davis Baird, "The Fisher/Pearson Chi-Squared Controversy: A Turning Point for Inductive Inference", The British Journal for the Philosophy of Science, Vol. 34, No. 2 (Jun. 1983), pp. 105-118, Oxford University Press. Stable URL: http://www.jstor.org/stable/687444 – Michael R. Chernick May 08 '12 at 19:12
  • @whuber You mentioned maximum likelihood as the method that justified the 19 degrees of freedom and seemed to think that the poster was using a different method of estimation. It occurred to me that Karl Pearson initiated the chi-square test and that he would have used the method of moments to estimate the parameters of the normal distribution. I remember the controversy between Fisher and Pearson over maximum likelihood versus the method of moments, so I was wondering how it played out with the chi-square test. – Michael R. Chernick May 08 '12 at 22:55
  • I found an article which pointed out that Fisher introduced the concept of degrees of freedom and argued about what the right number should be, which I believe Karl Pearson wouldn't accept. They were both very stubborn. I brought this up for historical interest, certainly not intending to imply that the controversy hadn't been settled. Now your comment about the degrees of freedom being much lower is intriguing. What theory supports your claim? I suspect the poster did something wrong in applying the K-S test, but I have no idea what he did. – Michael R. Chernick May 08 '12 at 23:02
  • If the binning is too coarse, the chi-square test loses power, and that could explain why he doesn't reject normality even though it looks like he should. But I don't see how it is a degrees-of-freedom issue for the null distribution. – Michael R. Chernick May 08 '12 at 23:04
  • Good points. I am not invoking any theory here. What I discovered *for these data* is that a better, but still imperfect, analysis (by combining the tail bins and normalizing expected counts) barely changes the $\chi^2$ statistic while reducing the df to 9. The result is a p-value of 6.6%, which is (for reasons I gave) probably too large. Also, a histogram of the counts suggests enough skewness to throw doubt on the Gaussian hypothesis. One way to interpret this is to suggest that the original statistic of 21.14 should not be referred to a $\chi^2$ distribution with 19 or 20 df. – whuber May 09 '12 at 14:24
  • Thanks Bill. I can understand that. I guess you are talking about its actual distribution based on the data rather than its distribution under the null hypothesis. I think my interpretation is valid: the bins are poorly chosen, which could affect the power of the test, and possibly the asymptotic chi-square really is a poor approximation to the actual null distribution. But it doesn't change the asymptotic null distribution from being chi-square with 19 df. – Michael R. Chernick May 09 '12 at 14:34
  • let us [continue this discussion in chat](http://chat.stackexchange.com/rooms/3385/discussion-between-whuber-and-michael-chernick) – whuber May 09 '12 at 14:44
  • I can't go to chat from my work computer. I thought I was done, but I will be happy to listen to what you would like to say in chat tonight when I have access to it. – Michael R. Chernick May 09 '12 at 14:54