I am modeling a continuous bivariate distribution of a random vector $(X_1,X_2)$ using a copula, and I would like to assess how well the model fits. Given a data sample, I could probably do a bivariate Kolmogorov-Smirnov test (though it seems nontrivial, as discussed in some of these threads). However, I have an alternative idea:
1. Pick a large $n$ and form a grid of weights $w_i=i/n$ for $i=1,\dots,n-1$.
2. For each weight, form the projection $Y_i:=w_i X_1+(1-w_i)X_2$.
3. Assess the distribution of each of $Y_1,\dots,Y_{n-1}$ under the hypothesized model using the univariate Kolmogorov-Smirnov test.
4. If the $Y_i$ tend to fail the tests in step 3 in a large fraction of instances$\color{red}{^*}$, reject the null hypothesis that the sample comes from the hypothesized bivariate distribution; otherwise, fail to reject it.
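For concreteness, here is a sketch of the procedure in Python. Everything model-specific is a stand-in: I use a bivariate Gaussian with correlation $0.5$ in place of the actual hypothesized copula model, and since the CDF of $Y_i$ is rarely available in closed form, I compare each projection against a large Monte Carlo sample from the model via a two-sample KS test:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Stand-in for the hypothesized model: a Gaussian copula with rho = 0.5
# and standard normal margins (i.e., a bivariate normal). Replace with
# whatever copula model is actually being assessed.
rho = 0.5
cov = np.array([[1.0, rho], [rho, 1.0]])

def sample_model(size, rng):
    """Draw (X1, X2) from the hypothesized bivariate distribution."""
    return rng.multivariate_normal([0.0, 0.0], cov, size=size)

data = sample_model(500, rng)           # here the "data" truly come from the model
reference = sample_model(100_000, rng)  # large Monte Carlo sample from the null model

n = 20
weights = np.arange(1, n) / n           # w_i = i/n, i = 1, ..., n-1
pvals = np.array([
    # Two-sample KS test: projected data vs. projected null-model sample,
    # since the CDF of Y_i = w*X1 + (1-w)*X2 is generally not known exactly.
    stats.ks_2samp(w * data[:, 0] + (1 - w) * data[:, 1],
                   w * reference[:, 0] + (1 - w) * reference[:, 1]).pvalue
    for w in weights
])

frac_rejected = np.mean(pvals < 0.05)
print(f"{frac_rejected:.2%} of the {n - 1} projections rejected at alpha = 0.05")
```

Since the data here are drawn from the model itself, the rejected fraction should be small; the open question below is what cutoff on that fraction gives a valid test.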
Questions:
- I wonder how sensible this idea is and what pitfalls I might be overlooking (aside from the coarseness of the grid, which may be a problem). If I am indeed overlooking something, a counterexample would be appreciated.
- $\color{red}{^*}$I am not sure what fraction is large enough to achieve a desired significance level. Is it as simple as a fraction $\alpha$ for a nominal significance level $\alpha$, or is it more complicated than that (for instance, because the tests for nearby $w_i$ are far from independent)?
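One way I could probe this empirically, rather than answer it: simulate the whole procedure repeatedly under the null and look at the distribution of the rejected fraction. The Gaussian model below is again only a stand-in; by linearity of expectation the mean fraction should be near $\alpha$ even under dependence, but the dependence between projections for nearby $w_i$ should inflate its spread relative to a Binomial$(n-1,\alpha)$ proportion:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
rho = 0.5  # stand-in null model: bivariate Gaussian with correlation 0.5
cov = [[1.0, rho], [rho, 1.0]]

def rejected_fraction(sample_size=200, n=20, alpha=0.05, ref_size=50_000):
    """One run of the projection procedure on data drawn from the null
    model itself; returns the fraction of the n-1 KS tests that reject."""
    data = rng.multivariate_normal([0.0, 0.0], cov, size=sample_size)
    ref = rng.multivariate_normal([0.0, 0.0], cov, size=ref_size)
    ws = np.arange(1, n) / n
    rejects = [
        stats.ks_2samp(w * data[:, 0] + (1 - w) * data[:, 1],
                       w * ref[:, 0] + (1 - w) * ref[:, 1]).pvalue < alpha
        for w in ws
    ]
    return float(np.mean(rejects))

# Empirical null distribution of the rejected fraction.
fractions = np.array([rejected_fraction() for _ in range(200)])
print("mean fraction rejected:", fractions.mean())
print("95th percentile:", np.quantile(fractions, 0.95))
```

If the 95th percentile sits well above $\alpha$, the simple "reject when more than an $\alpha$ fraction fails" rule would be anticonservative, and the simulated quantile itself might serve as an empirical critical value instead.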
- I would also be interested in extending this idea beyond 2 dimensions. Do any qualitatively new problems arise there? (I do realize the computational time would grow exponentially w.r.t. the number of dimensions and would quickly become prohibitive.)
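To make the growth concrete: in $d$ dimensions the natural analogue of the weight grid is a lattice on the probability simplex, and the number of grid points grows like $n^{d-1}/(d-1)!$. A small sketch (the helper `simplex_grid` is my own name, not an existing library function):

```python
from itertools import product

def simplex_grid(d, n):
    """All weight vectors (w_1, ..., w_d) with each w_j a positive
    multiple of 1/n and sum(w) = 1 -- the d-dimensional analogue of
    the grid w_i = i/n, 1 - w_i used in the bivariate case."""
    points = []
    for idx in product(range(1, n), repeat=d - 1):
        last = n - sum(idx)  # remaining mass for the final coordinate
        if last >= 1:
            points.append(tuple(i / n for i in idx) + (last / n,))
    return points

# Grid size is the number of compositions of n into d positive parts,
# i.e., C(n-1, d-1), which grows rapidly in d for fixed n:
for d in (2, 3, 4, 5):
    print(f"d={d}: {len(simplex_grid(d, n=20))} projections")
```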