17

In explaining why uncorrelated does not imply independent, there are several examples that involve a bunch of random variables, but they all seem so abstract: 1 2 3 4.

This answer seems to make sense. My interpretation: A random variable and its square may be uncorrelated (since apparently lack of correlation is something like linear independence) but they are clearly dependent.

I guess an example would be that (standardised?) height and height$^2$ might be uncorrelated but dependent, but I don't see why anyone would want to compare height and height$^2$.
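For what it's worth, here is a quick numerical check in R (simulated symmetric values standing in for standardised heights, so this is only an illustration):

set.seed(1)
x <- rnorm(1e5)              # symmetric around 0, like heights standardised to mean 0
cor(x, x^2)                  # near 0: uncorrelated
cor(x[x > 0], x[x > 0]^2)    # strongly positive once we condition on the sign: clearly dependent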

For the purpose of giving intuition to a beginner in elementary probability theory or similar purposes, what are some real-life examples of uncorrelated but dependent random variables?

BCLC
  • This doesn't answer your question, but seems relevant: Sometimes a rv and its square are correlated and sometimes uncorrelated. For example, if X is uniform on [0,1], then X and X^2 are uncorrelated. But if X is uniform on [-1, 1], then X and X^2 are uncorrelated. (Draw a picture to help see this.) However, in both cases, X and X^2 are dependent. – Martha Dec 26 '15 at 05:39
  • @Martha there's a typo in your comment. I think it's the first 'uncorrelated' that should be 'correlated'. ;) – An old man in the sea. Dec 26 '15 at 11:50
  • @Anoldmaninthesea correlated and sometimes correlated? – BCLC Dec 26 '15 at 12:18
  • @BCLC "if X is uniform on [0,1], then X and X^2 are uncorrelated." Should be "if X is uniform on [0,1], then X and X^2 are correlated.", I think. – An old man in the sea. Dec 26 '15 at 12:25
  • @Anoldmaninthesea You are correct: Correlated on [0,1], but uncorrelated on [-1,1]. Thanks for pointing out the typo. – Martha Dec 27 '15 at 20:44
  • When we say that two random variables are correlated we implicitly mean **linearly** correlated. But a dependency might be more complicated than that. – Emil Friedman Dec 31 '15 at 01:03
  • @EmilFriedman you mean correlated is linearly independent? – BCLC Aug 13 '18 at 12:18

4 Answers

18

In finance, GARCH (generalized autoregressive conditional heteroskedasticity) effects are a widely cited example: stock returns $r_t:=(P_t-P_{t-1})/P_{t-1}$, with $P_t$ the price at time $t$, are themselves uncorrelated with their own past $r_{t-1}$ if stock markets are efficient (otherwise you could easily and profitably predict where prices are going), but their squares $r_t^2$ and $r_{t-1}^2$ are not uncorrelated: there is time dependence in the variances, which cluster in time, with periods of high variance during volatile times.

Here is an artificial example (yet again, I know, but "real" stock return series may well look similar):

[Figure: simulated GARCH return series $r_t$ plotted against $t$, showing clusters of high volatility]

You can see the clustering of high volatility, in particular around $t\approx400$.

Generated using R code:

library(TSA)  # provides garch.sim() for simulating GARCH processes
garch01.sim <- garch.sim(alpha=c(.01,.55), beta=0.4, n=500)  # GARCH(1,1): alpha = (alpha0, alpha1), beta = beta1
plot(garch01.sim, type='l', ylab=expression(r[t]), xlab='t')  # plot the simulated return series
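
As a quick numerical check (my own addition, reusing the simulated series from the code above), the lag-one correlation of the returns should come out close to zero, while that of the squared returns is typically clearly positive:

r <- as.numeric(garch01.sim)   # simulated returns from above
n <- length(r)
cor(r[-1], r[-n])              # r_t vs r_{t-1}: close to 0
cor(r[-1]^2, r[-n]^2)          # r_t^2 vs r_{t-1}^2: typically clearly positive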
Christoph Hanck
6

A simple example is a bivariate distribution that is uniform on a doughnut-shaped area. The variables are uncorrelated, but clearly dependent - for example, if you know one variable is near its mean, then the other must be distant from its mean.
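
A minimal R sketch of this (my own illustration, picking a concrete doughnut with inner radius 1 and outer radius 2):

# Sample uniformly on the annulus 1 < x^2 + y^2 < 4 by rejection sampling
set.seed(42)
x <- runif(2e5, -2, 2)
y <- runif(2e5, -2, 2)
keep <- (x^2 + y^2 > 1) & (x^2 + y^2 < 4)
cor(x[keep], y[keep])               # essentially 0: uncorrelated
# ...but dependent: when x is near its mean of 0, |y| must be comparatively large
mean(abs(y[keep & abs(x) < 0.2]))   # noticeably larger than mean(abs(y[keep]))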

Russ Lenth
  • What exactly are the two variables? – BCLC Dec 27 '15 at 00:42
  • The two random variables $X$ and $Y$ whose joint distribution is uniform on the doughnut. For a specific example, consider the joint density $f(x,y) = 1/(3\pi)$ when $1 < x^2+y^2 < 4$ and $0$ otherwise. – Russ Lenth Dec 27 '15 at 00:55
  • Well I guess physics examples are real life. Thanks rvl. Why is your example true? – BCLC Dec 27 '15 at 02:43
  • Draw a graph of the region where the density is nonzero and think about it. – Russ Lenth Dec 27 '15 at 03:18
6

I found the following figure from Wikipedia very useful for intuition. In particular, the bottom row shows examples of uncorrelated but dependent distributions.

[Figure: Wikipedia's grid of scatterplots of (x, y) point sets, each labelled with its Pearson correlation coefficient]

Caption of the above plot on Wikipedia: Several sets of (x, y) points, with the Pearson correlation coefficient of x and y for each set. Note that the correlation reflects the noisiness and direction of a linear relationship (top row), but not the slope of that relationship (middle), nor many aspects of nonlinear relationships (bottom). N.B.: the figure in the center has a slope of 0, but in that case the correlation coefficient is undefined because the variance of Y is zero.
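
A rough numerical companion to the bottom row (my own sketch, not part of the figure): generate a few nonlinear patterns and confirm that their Pearson correlations are near zero even though the variables are clearly dependent.

set.seed(123)
x <- runif(1e5, -1, 1)
cor(x, x^2)                     # parabola: Pearson r close to 0
cor(x, abs(x))                  # V shape: Pearson r close to 0
theta <- runif(1e5, 0, 2*pi)
cor(cos(theta), sin(theta))     # points on a circle: Pearson r close to 0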

yuqian
0

You mention two words in the title of your question that are often used interchangeably, correlation and dependence, but in the body of your question you restrict correlation to Pearson correlation, which in my opinion is indeed the appropriate meaning of correlation when no other detail is provided. However, I believe that what you really want to ask goes beyond linear correlation, towards statistical dependence, that is: when are variables dependent, yet appear independent in the data you measure?

I mean, it's straightforward that a measure of linear association won't catch an association between variables that are associated but not in a linear way. Examples of that are all around us, though an r value of exactly 0 can be hard to find.

However, going back to the broader question that I elaborated, there could be spurious independence. That is, the variables are dependent, but your sampling will suggest that they are independent. I wrote an article about this, and there are scientific papers mentioning this problem too, such as this one.

Controlling for variables can be equivalent to slicing your data. By slicing too much (adjusting for many other variables), your two random variables can easily appear independent. One may say: "But I am not adjusting for anything!" And the answer is: you don't need to. The collected data may be biased (selection bias) without you being aware of it.
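
As a minimal sketch of what slicing can do (my own illustration, assuming a simple chain X → M → Y rather than anything from the linked article): X and Y are clearly dependent overall, but inside a thin slice of M they look independent.

set.seed(7)
n <- 1e5
x <- rnorm(n)
m <- x + rnorm(n)                 # mediator driven by x
y <- m + rnorm(n)                 # y depends on x only through m
cor(x, y)                         # clearly nonzero: dependent
slice <- abs(m - 1) < 0.05        # 'controlling for' m by slicing on it
cor(x[slice], y[slice])           # close to 0: looks independent within the slice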

mribeirodantas