This topic is dense with notation, which makes it a bit confusing. Is the following interpretation correct?
Suppose we have two jointly distributed random variables, $X$ and $Y$, with arbitrary (but let's assume known) marginal CDFs $F_X$ and $F_Y$. The problem is that the joint CDF at any pair of values $(x, y)$ is not simply $F_X(x)F_Y(y)$, because the variables are not independent. The actual joint CDF is often written $H(x,y) = P(X \le x, Y \le y)$.
Now, it seems to me that the copula, in the end, is simply a function $C$ such that $C(F_X(x),F_Y(y)) = H(x,y)$.
It accomplishes this by, in a sense, running the marginals "backwards" (via the probability integral transform, $U = F_X(X)$ and $V = F_Y(Y)$), and baking the dependence into another joint distribution, this one on $[0,1]^{2}$ with uniform marginals.
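To make the "running the marginals backwards" idea concrete, here is a minimal sketch of the probability integral transform: pushing a continuous random variable through its own CDF yields a Uniform(0,1) variable. The exponential distribution here is just an illustrative choice, not anything special:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Draw from an arbitrary continuous distribution (exponential is an
# illustrative choice; any continuous CDF works the same way)
x = rng.exponential(scale=2.0, size=100_000)

# Probability integral transform: U = F_X(X) is Uniform(0, 1)
u = stats.expon(scale=2.0).cdf(x)

# A Uniform(0,1) sample has mean ~ 0.5 and variance ~ 1/12
print(u.mean(), u.var())
```

This is why the copula can live on $[0,1]^2$ with uniform marginals: each marginal, transformed this way, is individually uniform, and whatever structure remains in the pair $(U, V)$ is pure dependence.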
So, in the end, all it is is a mapping with the right "correction" for dependence: all you need to give it (once it has been determined) are the naïve marginal values $F_X(x)$ and $F_Y(y)$, and it delivers the actual $H(x,y)$? In other words, Sklar's theorem guarantees that a copula $C$ satisfying $H(x,y) = C(F_X(x), F_Y(y))$ exists (and is unique when the marginals are continuous), so the copula captures all of the dependence information?
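One way to convince myself of this reading is a Monte Carlo check of Sklar's decomposition for a concrete case. The sketch below assumes a Gaussian copula with correlation $\rho = 0.6$ and attaches exponential and logistic marginals (all illustrative choices): the empirical joint CDF $H(x_0, y_0)$ should agree with $C(F_X(x_0), F_Y(y_0))$ up to sampling error:

```python
import numpy as np
from scipy import stats

rho = 0.6
cov = [[1.0, rho], [rho, 1.0]]
rng = np.random.default_rng(1)

# Sample from a Gaussian copula: correlated standard normals pushed
# through the standard normal CDF give uniforms with dependence baked in
z = rng.multivariate_normal([0.0, 0.0], cov, size=200_000)
u, v = stats.norm.cdf(z[:, 0]), stats.norm.cdf(z[:, 1])

# Attach arbitrary marginals via the inverse CDFs (illustrative choices)
x = stats.expon.ppf(u)
y = stats.logistic.ppf(v)

# Empirical H(x0, y0) = P(X <= x0, Y <= y0) at a test point
x0, y0 = 1.0, 0.5
H_empirical = np.mean((x <= x0) & (y <= y0))

# Sklar's side: C(F_X(x0), F_Y(y0)); for the Gaussian copula this is the
# bivariate normal CDF evaluated at the normal scores of the uniforms
u0, v0 = stats.expon.cdf(x0), stats.logistic.cdf(y0)
C_value = stats.multivariate_normal(mean=[0, 0], cov=cov).cdf(
    [stats.norm.ppf(u0), stats.norm.ppf(v0)]
)
print(H_empirical, C_value)  # should agree to within Monte Carlo error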