This topic is dense with notation, which makes it a bit confusing. Is the following interpretation correct?
Suppose we have two jointly distributed random variables, $X$ and $Y$, with arbitrary (but let's assume known) marginal CDFs $F_X$ and $F_Y$. The problem is that the joint CDF at any pair of values $(x, y)$ is not simply $F_X(x)F_Y(y)$, because the variables are not independent. The actual joint CDF is often written $H(x,y) = P(X \le x, Y \le y)$.
Now, it seems to me that the copula, in the end, is simply a function $C$ such that $C(F_X(x),F_Y(y)) = H(x,y)$.
It accomplishes this by, in a sense, running the marginals "backwards" (via the probability integral transform, $U = F_X(X)$ and $V = F_Y(Y)$), and baking the dependence into another joint distribution, this one on $[0,1]^{2}$ with uniform marginals.
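To make the "running the marginals backwards" idea concrete, here is a minimal sketch of the probability integral transform: pushing a continuous random variable through its own CDF yields a Uniform(0,1) variable. The exponential distribution here is just an illustrative choice, not anything special:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Draw from an arbitrary continuous distribution (exponential is an
# illustrative choice; any continuous CDF works the same way)
x = rng.exponential(scale=2.0, size=100_000)

# Probability integral transform: U = F_X(X) is Uniform(0, 1)
u = stats.expon(scale=2.0).cdf(x)

# A Uniform(0,1) sample has mean ~ 0.5 and variance ~ 1/12
print(u.mean(), u.var())
```

This is why the copula can live on $[0,1]^2$ with uniform marginals: each marginal, transformed this way, is individually uniform, and whatever structure remains in the pair $(U, V)$ is pure dependence.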
So, in the end, all it is is a mapping with the right "correction" for dependence: all you need to give it (once it has been determined) are the naïve marginal values $F_X(x)$ and $F_Y(y)$, and it delivers the actual $H(x,y)$? In other words, Sklar's theorem guarantees that a copula $C$ satisfying $H(x,y) = C(F_X(x), F_Y(y))$ exists (and is unique when the marginals are continuous), so the copula captures all of the dependence information?
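One way to convince myself of this reading is a Monte Carlo check of Sklar's decomposition for a concrete case. The sketch below assumes a Gaussian copula with correlation $\rho = 0.6$ and attaches exponential and logistic marginals (all illustrative choices): the empirical joint CDF $H(x_0, y_0)$ should agree with $C(F_X(x_0), F_Y(y_0))$ up to sampling error:

```python
import numpy as np
from scipy import stats

rho = 0.6
cov = [[1.0, rho], [rho, 1.0]]
rng = np.random.default_rng(1)

# Sample from a Gaussian copula: correlated standard normals pushed
# through the standard normal CDF give uniforms with dependence baked in
z = rng.multivariate_normal([0.0, 0.0], cov, size=200_000)
u, v = stats.norm.cdf(z[:, 0]), stats.norm.cdf(z[:, 1])

# Attach arbitrary marginals via the inverse CDFs (illustrative choices)
x = stats.expon.ppf(u)
y = stats.logistic.ppf(v)

# Empirical H(x0, y0) = P(X <= x0, Y <= y0) at a test point
x0, y0 = 1.0, 0.5
H_empirical = np.mean((x <= x0) & (y <= y0))

# Sklar's side: C(F_X(x0), F_Y(y0)); for the Gaussian copula this is the
# bivariate normal CDF evaluated at the normal scores of the uniforms
u0, v0 = stats.expon.cdf(x0), stats.logistic.cdf(y0)
C_value = stats.multivariate_normal(mean=[0, 0], cov=cov).cdf(
    [stats.norm.ppf(u0), stats.norm.ppf(v0)]
)
print(H_empirical, C_value)  # should agree to within Monte Carlo error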