
This is an extremely basic question in probability theory. Namely, for any two variables $x$ and $y$, if $\mathrm{cov}(x,y)$ is not 0 (in the population), what does that imply about their joint distribution? In other words, how is the joint distribution of variables related to their covariance?

ChinG
  • With a few exceptions (like the multivariate normal), the joint distribution cannot be recovered from the covariance (and the marginals, say). One could derive some inequalities, though. – kjetil b halvorsen Oct 29 '15 at 14:26
  • Are the two related at all? Let me rephrase: if cov(x,y)=0, does that imply anything about their joint distribution? Or are these unrelated concepts? I guess (in the discrete case) you could have probability mass at various values of x and y such that their covariance is 0. – ChinG Oct 29 '15 at 14:28
  • So the nonexistence of a linear association between variables is not informative about their joint density. Well, I guess if the joint density equals the product of the marginals, this implies independence, which then would imply 0 covariance. – ChinG Oct 29 '15 at 14:30
  • @ChinG An example to make things more complicated: https://en.wikipedia.org/wiki/Normally_distributed_and_uncorrelated_does_not_imply_independent – Tim Oct 29 '15 at 14:32
  • You should look into copulas! Then find a variety of bivariate copulas with zero correlation (well, rank correlation fits better with the copula concept) and see how different they can be. Zero correlation does say something, of course: it rules out all joint distributions with non-zero correlation, which is a lot, but the ones that remain are still quite many... – kjetil b halvorsen Oct 29 '15 at 14:34
  • Uncorrelatedness does not imply independence, while the converse holds. – Xi'an Oct 29 '15 at 15:34
  • You might find the illustrations of several sets of (differently) uncorrelated normal random variates in [this answer](http://stats.stackexchange.com/questions/162547/why-is-pearsons-%CF%81-only-an-exhaustive-measure-of-association-if-the-joint-distri/162576#162576) interesting (I don't give the mathematical formulas, but it's mostly to convey a sense of what sorts of things can be done). You could fill books with different-looking constructions of uncorrelated normal variates, let alone all the other kinds of distributions. – Glen_b Oct 29 '15 at 16:24
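A minimal simulation sketch of the construction behind the example Tim linked (assuming `numpy`): flip the sign of a standard normal with an independent fair coin. The result is again standard normal and uncorrelated with the original, yet the two share the same absolute value, so they are strongly dependent.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

x = rng.standard_normal(n)            # X ~ N(0, 1)
w = rng.choice([-1.0, 1.0], size=n)   # independent Rademacher sign flip
y = w * x                             # Y ~ N(0, 1) as well, but |Y| = |X|

print(np.cov(x, y)[0, 1])             # ~0: Cov(X, Y) = E[W] E[X^2] = 0
print(np.corrcoef(x**2, y**2)[0, 1])  # exactly 1: X^2 = Y^2, so X, Y are dependent
```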

1 Answer


I would ask the opposite question: what are the implications of zero covariance for any surviving dependence? The two variables can certainly be stochastically dependent even though their covariance is zero; but what kinds of dependence, and what kinds of bivariate joint distributions, are excluded when the covariance is zero?

Some examples:

a) A great many "named" bivariate continuous joint distributions (i.e. joint distributions whose two marginals belong to the same family); in the best-known case, the bivariate normal, zero covariance is equivalent to independence, so every dependent member of the family is ruled out.

b) Bivariate continuous distributions of the Farlie-Gumbel-Morgenstern family

$$H_{X,Y}(x,y)=F_X(x)G_Y(y)\left(1+\alpha\big(1-F_X(x)\big)\big(1-G_Y(y)\big)\right), \;\; |\alpha| \le 1$$

for two random variables with arbitrary marginal distribution functions $F_X(x)$ and $G_Y(y)$. Here $\alpha = 0$ is necessary and sufficient for both zero covariance and stochastic independence, so you cannot have the one without the other.
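As a quick numerical check of this family (a sketch assuming `numpy`, with uniform marginals for simplicity), one can sample from the FGM copula by conditional inversion: given $U = u$, the conditional CDF of $V$ is $v\left(1 + \alpha(1-2u)(1-v)\right)$, a quadratic in $v$ that inverts in closed form. With uniform marginals the Pearson correlation of an FGM pair equals $\alpha/3$, so the sample covariance should vanish exactly when $\alpha = 0$.

```python
import numpy as np

def fgm_sample(alpha, n, rng):
    """Draw n pairs from the FGM copula by conditional inversion.

    Given U = u, the conditional CDF of V is v * (1 + a * (1 - v))
    with a = alpha * (1 - 2u); solving that quadratic for v inverts it.
    """
    u = rng.uniform(size=n)
    t = rng.uniform(size=n)
    a = alpha * (1.0 - 2.0 * u)
    small = np.abs(a) < 1e-12                    # conditional law is uniform here
    disc = (1.0 + a) ** 2 - 4.0 * a * t          # discriminant, always >= 0
    v = np.where(small, t,
                 (1.0 + a - np.sqrt(disc)) / np.where(small, 1.0, 2.0 * a))
    return u, v

rng = np.random.default_rng(1)
for alpha in (-1.0, 0.0, 0.5, 1.0):
    u, v = fgm_sample(alpha, 500_000, rng)
    print(alpha, np.corrcoef(u, v)[0, 1], alpha / 3)  # sample corr vs alpha/3
```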

c) Finally, one must remember that although covariance is usually described as reflecting the "linear" dependence between two variables, this can mislead: when a random variable $X$ is a purely non-linear function of another variable $Y$, their covariance will almost always be non-zero. Consider a very simple case: let

$$X = Y^2 \implies \text{Cov}(X,Y) = E(XY) - E(X)E(Y) = E(Y^3) - E(Y^2)E(Y)$$

This in general won't be zero; it vanishes only in special cases, e.g. when $Y$ is symmetric about zero, so that $E(Y^3)=E(Y)=0$.
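A quick numerical illustration of both cases (a sketch assuming `numpy`): an asymmetric $Y$, say a unit exponential, gives $\text{Cov}(Y^2,Y) = E(Y^3) - E(Y^2)E(Y) = 6 - 2 \cdot 1 = 4$, while a $Y$ symmetric about zero, such as a standard normal, is the exceptional case where the covariance vanishes despite the exact functional dependence.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1_000_000

y = rng.exponential(size=n)      # E(Y) = 1, E(Y^2) = 2, E(Y^3) = 6
print(np.cov(y**2, y)[0, 1])     # ~4 = E(Y^3) - E(Y^2) E(Y)

z = rng.standard_normal(n)       # symmetric about 0: odd moments vanish
print(np.cov(z**2, z)[0, 1])     # ~0, although Z^2 is a function of Z
```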

And so on. Certainly, there exist infinitely many ways to model stochastic dependence with zero covariance, but the above shows that if one wants to go down that road, "off-the-shelf" bivariate distributions won't do, nor will postulating a purely non-linear relationship.

This is why copulas are perhaps the way to go, as a comment suggested: from this perspective, they give us a systematic way to model dependence with any marginal distributions. This is important because, when looking at data, we can more easily describe a distribution for each data series separately, calling on our stock of well-known and well-studied marginal distribution families (and each variable may appear to have a marginal from a different family). Their joint distribution may then be non-standard and not already studied, and we turn to copulas to describe the dependence.
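A minimal sketch of that workflow (assuming `numpy` and `scipy`; the gamma and Student-$t$ marginals are merely illustrative choices): build a Gaussian copula by pushing correlated normals through the normal CDF, then map the resulting uniforms through the quantile functions of whatever marginal families fit each data series.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n = 100_000
rho = 0.7

# Gaussian copula: correlated normals -> dependent uniforms via the normal CDF.
cov = np.array([[1.0, rho], [rho, 1.0]])
z = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=n)
u = stats.norm.cdf(z)

# Arbitrary marginals via the inverse-CDF (quantile) transform.
x = stats.gamma.ppf(u[:, 0], a=2.0)   # gamma marginal for the first series
y = stats.t.ppf(u[:, 1], df=5)        # Student-t marginal for the second

print(stats.spearmanr(x, y)[0])       # rank correlation inherited from the copula
```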

Alecos Papadopoulos