I have a question regarding the A-D test, and perhaps goodness-of-fit tests in general. I fitted a long list of distributions to a dataset. According to A-D, a Wakeby distribution provides the closest fit. In Mathematica, I specified the Wakeby distribution with the fitted parameters and then ran all of the available goodness-of-fit tests. This gave me a table of p-values:
$$\begin{array}{l|c} \text{Test} & \text{p-value} \\ \hline \text{Anderson-Darling} & 0.98531 \\ \text{Cramér-von Mises} & 0.98686 \\ \text{Kolmogorov-Smirnov} & 0.98496 \\ \text{Kuiper} & 0.98672 \\ \text{Pearson chi-squared} & 0.99854 \\ \text{Watson U-squared} & 0.99670 \\ \end{array}$$
These are very high p-values across the board, which suggests that the data fit the distribution extremely closely. However, from some reading I understand that there are pitfalls in using the A-D test when the distribution's parameters are estimated from the same data. What I don't understand is why, how to avoid them, and whether they also apply to the other tests in the table.
I saw this question, where two different methods of running the A-D test in R gave vastly different p-values. One of the answers states that the `nortest` result is correct, and that `goftest` isn't compensating for the fact that the parameters were estimated from the data. I ran the same test from that question in Mathematica to see which result I would get, and it gave me the same p-value as `goftest`. Could the p-values I'm getting from my data be flawed in the same way? Am I misusing the test?
Also, the Mathematica documentation for the Anderson-Darling test, under "Possible Issues", notes that testing against a fitted distribution can cause a problem, and that one solution is to run the test with a Monte Carlo simulation. I did so with 100,000 samples, which gave a p-value of 0.98592, very close to the original value. My assumption, then, is that the p-value is reliable. Am I correct to assume that?
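For context, my understanding of what such a Monte Carlo option does is a parametric bootstrap: simulate from the fitted model, re-fit the parameters on every simulated sample, and recompute the statistic, so the reference distribution accounts for estimation. A minimal sketch of that idea in Python/SciPy, using a Gumbel distribution as a stand-in (SciPy has no Wakeby distribution, and the data here are simulated, not mine):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Simulated stand-in data (the real dataset and Wakeby fit are not reproduced here).
data = stats.gumbel_r.rvs(loc=10, scale=2, size=200, random_state=rng)

def ad_stat(x, cdf):
    """Anderson-Darling statistic of sample x against a fully specified CDF."""
    x = np.sort(x)
    n = len(x)
    u = np.clip(cdf(x), 1e-12, 1 - 1e-12)  # guard the logs at the boundaries
    i = np.arange(1, n + 1)
    return -n - np.mean((2 * i - 1) * (np.log(u) + np.log1p(-u[::-1])))

# Step 1: fit the candidate family to the observed data.
loc_hat, scale_hat = stats.gumbel_r.fit(data)
ad_obs = ad_stat(data, lambda x: stats.gumbel_r.cdf(x, loc_hat, scale_hat))

# Step 2: simulate from the fitted model, RE-FITTING on every replicate,
# so the null distribution of the statistic reflects parameter estimation.
B = 500
exceed = 0
for _ in range(B):
    sim = stats.gumbel_r.rvs(loc_hat, scale_hat, size=len(data), random_state=rng)
    l_hat, s_hat = stats.gumbel_r.fit(sim)  # the crucial re-fit step
    if ad_stat(sim, lambda x: stats.gumbel_r.cdf(x, l_hat, s_hat)) >= ad_obs:
        exceed += 1

p_boot = (exceed + 1) / (B + 1)  # Monte Carlo p-value
print(p_boot)
```

If the re-fit inside the loop were skipped, the simulated statistics would be too large on average and the resulting p-value inflated, which is exactly the pitfall I'm asking about.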
In general, what is the problem with using a goodness-of-fit test when the distribution parameters are fitted to the data, and how can I avoid it?
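To make the pitfall concrete (this is my own illustration, not from the linked question; it uses a normal/KS pairing rather than Wakeby/A-D since SciPy has neither a Wakeby distribution nor a fitted-parameter A-D table): when the null hypothesis is exactly true, a correctly calibrated test's p-values are Uniform(0,1); plugging sample-estimated parameters into a fixed-parameter test instead piles the p-values up near 1.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
pvals = []
for _ in range(500):
    x = rng.normal(size=100)  # H0 is exactly true
    # Wrong usage: estimate mu and sigma from the very sample being tested,
    # then hand them to the standard (fixed-parameter) KS test.
    mu, sigma = x.mean(), x.std(ddof=1)
    pvals.append(stats.kstest(x, 'norm', args=(mu, sigma)).pvalue)
pvals = np.array(pvals)

# Under correct calibration the p-values would average about 0.5;
# here they cluster near 1, so the test almost never rejects.
print(round(pvals.mean(), 3))
```

The fitted curve is tailored to the sample, so the empirical CDF sits unnaturally close to it, and the nominal p-values overstate the quality of fit, which is why I'm suspicious of my table of values near 0.99.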