
For example, for a dataset generated from a lognormal distribution, I estimate a normal distribution from that data (using maximum likelihood or some other method).

Can I then perform a K-S test to check how well the data fit the estimated normal distribution?

I'm asking because http://itl.nist.gov/div898/handbook/eda/section3/eda35g.htm says:

Perhaps the most serious limitation is that the distribution must be fully specified. That is, if location, scale, and shape parameters are estimated from the data, the critical region of the K-S test is no longer valid. It typically must be determined by simulation.

If this is true, how can I test the goodness of fit of an estimated model?

cqcn1991
  • 1
    Covered many times here, e.g. [here](http://stats.stackexchange.com/questions/111268/kolmogorov-smirnov-for-pareto-distribution-on-sample) or [here](http://stats.stackexchange.com/questions/45033/can-i-use-kolmogorov-smirnov-test-and-estimate-distribution-parameters) (etc.). Your quote says it can be done and how ("*by simulation*"): find critical values by simulation (or p-values each time if you don't have lots of them). There is a discussion of how to approach the calculations in particular instances [here](http://stats.stackexchange.com/questions/111693/simulation-of-ks-test-with-estimated-parameters). – Glen_b Jan 31 '16 at 05:25

1 Answer


Here is what you do. Simulate a large number (for example, 10,000) of samples of normal data, each the same size as your original sample, using your estimated parameters.

For each simulated sample, re-estimate the parameters from that sample and compute the K-S statistic against the refitted distribution. Now count how many of the 10,000 K-S statistics are greater than the K-S statistic from your original fit and divide by 10,000. That proportion is your estimated p-value for the goodness-of-fit test.
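
The procedure above could be sketched in Python roughly as follows. This is only a minimal illustration, assuming NumPy and SciPy are available; the function names `ks_stat_fitted_normal` and `bootstrap_ks_pvalue`, the seeds, and the lognormal example data are all made up for the example. It counts simulated statistics that are greater than or equal to the observed one, which for continuous data is the same as "greater than".

```python
import numpy as np
from scipy import stats


def ks_stat_fitted_normal(x):
    """K-S distance between x and a normal fitted to x by maximum likelihood."""
    mu, sigma = stats.norm.fit(x)  # ML estimates of mean and standard deviation
    return stats.kstest(x, "norm", args=(mu, sigma)).statistic


def bootstrap_ks_pvalue(data, n_sim=10_000, seed=0):
    """Simulation-based p-value for the fit of a normal distribution to data."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    mu_hat, sigma_hat = stats.norm.fit(data)
    d_obs = ks_stat_fitted_normal(data)

    d_sim = np.empty(n_sim)
    for i in range(n_sim):
        # Draw a sample of the same size from the fitted normal ...
        sample = rng.normal(mu_hat, sigma_hat, size=data.size)
        # ... and re-estimate the parameters on it before computing D.
        d_sim[i] = ks_stat_fitted_normal(sample)

    # Proportion of simulated statistics at least as large as the observed one.
    return d_obs, float(np.mean(d_sim >= d_obs))


# Hypothetical data: lognormal, as in the question.
rng = np.random.default_rng(1)
data = rng.lognormal(mean=0.0, sigma=1.0, size=200)
d_obs, p_value = bootstrap_ks_pvalue(data, n_sim=2000)
print(f"D = {d_obs:.3f}, estimated p-value = {p_value:.4f}")
```

The key step is the refit inside the loop: skipping it and comparing every simulated sample against the original estimates would reproduce the fully specified test, which is exactly the case the NIST quote warns about.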

soakley
  • Thanks. I think this is also stated here: https://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test#Test_with_estimated_parameters. Is there a book I can refer to for a more detailed explanation? – cqcn1991 Jan 31 '16 at 05:00
  • And by `K-S statistic`, I take it you mean `sup|Fn - F|`, i.e. the maximum difference between the empirical and model distribution functions? – cqcn1991 Jan 31 '16 at 05:12
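
On the definition raised in the last comment: for a continuous model CDF, the supremum of `|Fn - F|` is attained at one of the data points, just before or just after a jump of the empirical CDF. A small sketch of that calculation, assuming NumPy and SciPy (the helper name `ks_statistic` and the sample are hypothetical), compared against `scipy.stats.kstest`:

```python
import numpy as np
from scipy import stats


def ks_statistic(x, cdf):
    """sup |Fn - F| for a sample x and a continuous model CDF."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    f = cdf(x)
    # The ECDF jumps at each data point, so check both sides of every jump.
    d_plus = np.max(np.arange(1, n + 1) / n - f)
    d_minus = np.max(f - np.arange(0, n) / n)
    return max(d_plus, d_minus)


rng = np.random.default_rng(0)
x = rng.normal(size=50)  # hypothetical sample
d_manual = ks_statistic(x, stats.norm.cdf)
d_scipy = stats.kstest(x, "norm").statistic
print(d_manual, d_scipy)
```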