I am quite new to this copula idea. In particular, I am confused about the definition of a Gaussian copula. For a copula to be a Gaussian copula, do the marginals have to be Gaussian as well, or can they follow any distribution? From the Wikipedia page (http://en.wikipedia.org/wiki/Copula_(probability_theory)#Gaussian_copula) it looks like they have to be, but I thought they needn't be.
1 Answer
No, that's the point of the copula.
Consider a random variable $X$ and its CDF $F$. Since $F$ is just a function, you can apply it to $X$ to obtain a new random variable $W \equiv F{(X)}$. Provided $F$ is continuous, it is always true that $W\sim \operatorname{Uniform}{[0,1]}$ when defined this way. (This was actually the content of my very first question here.)
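To make the claim about $W$ concrete, here is a minimal simulation sketch. The choice of an exponential distribution with rate 2, the sample size, and the names `exponential_cdf`, `xs` and `ws` are all my own assumptions for illustration; any continuous distribution with a known CDF should behave the same way.

```python
import math
import random

def exponential_cdf(x, rate):
    """CDF of an Exponential(rate) variable: F(x) = 1 - exp(-rate * x) for x >= 0."""
    return 1.0 - math.exp(-rate * x) if x >= 0 else 0.0

rng = random.Random(0)   # fixed seed so the run is reproducible
rate = 2.0               # illustrative choice, not from the original post
n = 100_000

# Draw X from a clearly non-uniform distribution...
xs = [rng.expovariate(rate) for _ in range(n)]
# ...and push every draw through its own CDF to get W = F(X).
ws = [exponential_cdf(x, rate) for x in xs]

# A Uniform[0,1] variable has mean 1/2, variance 1/12, and P(W <= 0.25) = 0.25.
mean_w = sum(ws) / n
var_w = sum((w - mean_w) ** 2 for w in ws) / n
frac_below_quarter = sum(w <= 0.25 for w in ws) / n

print(mean_w, var_w, frac_below_quarter)
```

The three printed summaries should land close to $1/2$, $1/12$ and $0.25$, which is what you would expect if $W$ is uniform on $[0,1]$ even though $X$ itself is heavily skewed.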
Now think of a vector of random variables $(X_1,\dots,X_d)$ with their respective marginal CDFs $F_1,\dots,F_d$. $\left(F_1{(X_1)},\dots,F_d{(X_d)}\right)$ is just a vector of those $W$'s. It's a vector of random variables that are each uniformly distributed on $[0,1]$. Keep in mind that these are marginal distributions; we haven't said anything yet about whether and how they might depend on each other.
The Gaussian copula is just a multivariate probability distribution defined on the unit square/cube/hypercube $[0,1]^d$, with uniform marginals. From what we showed above, it should be apparent that $\left(F_1{(X_1)},\dots,F_d{(X_d)}\right)$ is a random vector that takes values in $[0,1]^d$ and has uniform marginals. Therefore any such vector could have a Gaussian copula as its joint distribution.
So when Wikipedia says the Gaussian copula is "$\Phi_R(\Phi^{-1}{(u_1)},\dots,\Phi^{-1}{(u_d)})$", it doesn't mean that the $u$'s are your data. They can be, but they don't have to be. Each $u_i$ can be freely defined as $F_i{(X_i)}$, where $X_i$ is one of your data variables and $F_i$ is its CDF. So that distribution is equivalent to $\Phi_R(\Phi^{-1}{(F_1{(X_1)})},\dots,\Phi^{-1}{(F_d{(X_d)})})$. If you wanted standard Gaussian marginals, your distribution would be $\Phi_R(\Phi^{-1}{(\Phi{(X_1)})},\dots,\Phi^{-1}{(\Phi{(X_d)})})$, which simplifies to $\Phi_R(X_1,\dots,X_d)$; that is, the joint distribution would reduce to a multivariate normal distribution.
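If it helps to see this in action, here is a rough sketch that builds two variables with non-Gaussian marginals (one exponential, one uniform on an interval) whose dependence comes from a bivariate Gaussian copula. The correlation `rho = 0.7`, the marginal parameters, and every function name are assumptions I picked for illustration; this is one common construction, not the only way to sample from a copula.

```python
import math
import random
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: cdf is Phi, inv_cdf is Phi^{-1}

def exponential_cdf(x, rate):
    return -math.expm1(-rate * x)        # 1 - exp(-rate * x), accurate near 0

def exponential_inv_cdf(u, rate):
    return -math.log1p(-u) / rate        # -log(1 - u) / rate

def uniform_cdf(x, low, high):
    return (x - low) / (high - low)

def uniform_inv_cdf(u, low, high):
    return low + (high - low) * u

def pearson(a, b):
    """Plain sample Pearson correlation."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

rng = random.Random(42)
rho, rate, low, high, n = 0.7, 1.5, 10.0, 20.0, 20_000  # illustrative values

x1, x2 = [], []
for _ in range(n):
    # Step 1: correlated standard normals (Z1, Z2) with correlation rho.
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
    # Step 2: U_i = Phi(Z_i) is uniform on [0,1]; (U1, U2) has a Gaussian copula.
    u1, u2 = std_normal.cdf(z1), std_normal.cdf(z2)
    # Step 3: X_i = F_i^{-1}(U_i) gives whatever marginals we like.
    x1.append(exponential_inv_cdf(u1, rate))
    x2.append(uniform_inv_cdf(u2, low, high))

# Going the other way: Phi^{-1}(F_i(X_i)) recovers the underlying normal scores.
scores1 = [std_normal.inv_cdf(exponential_cdf(x, rate)) for x in x1]
scores2 = [std_normal.inv_cdf(uniform_cdf(x, low, high)) for x in x2]

mean_x1 = sum(x1) / n                    # should be near 1 / rate
mean_x2 = sum(x2) / n                    # should be near (low + high) / 2
score_corr = pearson(scores1, scores2)   # should be near rho

print(mean_x1, mean_x2, score_corr)
```

Neither `x1` nor `x2` is Gaussian, yet the correlation of the normal scores $\Phi^{-1}{(F_i{(X_i)})}$ should come out close to `rho`. That is the sense in which the copula is Gaussian while the marginals are not.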
This is all laid out in that same article, in the Mathematical Definition section, but it's pretty terse.
