38

Any hard-working student is a counterexample to "all students are lazy".

What are some simple counterexamples to "if random variables $X$ and $Y$ are uncorrelated then they are independent"?

amoeba
Clare Brown
  • 11
    I think this is a duplicate, but I'm too lazy to search for it. Take $X\sim N(0,1)$ and $Y=X^2$. $cov(X,Y)=EX^3=0$, but clearly the two variables are not independent. – mpiktas Feb 04 '14 at 07:13
  • 1
    [a simple example](http://stats.stackexchange.com/questions/41317/how-to-show-operations-on-two-random-variables-each-bernoulli-are-dependent-bu) (though there are perhaps even simpler ones) – Glen_b Feb 04 '14 at 09:03
  • 1
    Take $U$ to be uniformly distributed on $[0,2\pi]$ and $X=\cos U$, $Y = \sin U$. – Dilip Sarwate Feb 04 '14 at 14:34
  • Because the sense of "simplest" is undefined, this question is not objectively answerable. I chose the duplicate at http://stats.stackexchange.com/questions/41317 on the basis of simplest=smallest sum of cardinalities of supports of the marginal distributions. – whuber Feb 04 '14 at 14:55
  • 3
    @whuber: Even though "simplest" is indeed not very well defined, the answers here, e.g. the answer by Glen_b, clearly provide *much* simpler examples than the thread you closed this one as a duplicate of. I suggest reopening this one (I have voted already) and perhaps making it CW to highlight the fact that "simplest" is poorly defined and the OP is perhaps asking for various "simple" examples. – amoeba Mar 04 '16 at 22:02
  • @amoeba I think you are correct, so I implemented your suggestions. Thank you for them; I apologize it took me some time to respond. A duplicate question at http://stats.stackexchange.com/questions/199486/real-life-examples-of-difference-between-independence-and-correlation now redirects to this one, too. – whuber Mar 16 '16 at 14:24
  • @whuber Thanks. However, don't you think that the linked question should rather be reopened too? I don't see why it is a duplicate. This one is about simple examples, that one is about real-life examples (perhaps we can make that one CW too). See Silverfish's Meta thread http://meta.stats.stackexchange.com/questions/3005/ where the suggestion to reopen that question has the most upvotes. If you disagree, can you maybe comment/answer there why? – amoeba Mar 16 '16 at 20:47
  • Related: https://stats.stackexchange.com/q/12842/119261. – StubbornAtom May 25 '20 at 19:02

8 Answers

24

Let $X\sim U(-1,1)$.

Let $Y=X^2$.

The variables are uncorrelated but dependent.

Alternatively, consider a discrete bivariate distribution that puts probability on 3 points, (-1,1), (0,-1), (1,1), with probabilities 1/4, 1/2, 1/4 respectively. Then the variables are uncorrelated but dependent.
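The three-point example can be checked exactly. The following is a minimal sketch using exact fractions; the helper name `expect` is just an illustrative choice, not part of the original answer.

```python
from fractions import Fraction as F

# Joint pmf of the three-point example: {(x, y): probability}.
pmf = {(-1, 1): F(1, 4), (0, -1): F(1, 2), (1, 1): F(1, 4)}

def expect(g):
    """Exact expectation of g(x, y) under the joint pmf."""
    return sum(p * g(x, y) for (x, y), p in pmf.items())

# Cov(X, Y) = E[XY] - E[X] E[Y]
cov = expect(lambda x, y: x * y) - expect(lambda x, y: x) * expect(lambda x, y: y)

# Independence would require P(X=0, Y=1) = P(X=0) * P(Y=1).
p_joint = pmf.get((0, 1), F(0))
p_x0 = sum(p for (x, _), p in pmf.items() if x == 0)
p_y1 = sum(p for (_, y), p in pmf.items() if y == 1)

print("Cov(X, Y) =", cov)
print("P(X=0, Y=1) =", p_joint, "but P(X=0) * P(Y=1) =", p_x0 * p_y1)
```

The covariance is exactly zero, while the joint probability of $(0,1)$ differs from the product of the marginals, so the variables cannot be independent.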

Consider bivariate data uniform in a diamond (a square rotated 45 degrees). The variables will be uncorrelated but dependent.

Those are about the simplest cases I can think of.

Glen_b
  • Are all random variables which are symmetric and centered around 0 uncorrelated? – Martin Thoma Feb 11 '15 at 09:04
  • 1
    @moose Your description is ambiguous. If you mean "if $X$ is symmetric about zero and $Y$ is symmetric about zero" then no, since a bivariate normal with standard normal margins can be correlated, for example. If you mean "if $X$ is symmetric about zero and $Y$ is an even function of $X$", then as long as the variances exist I believe the answer is yes. If you mean something else you'll have to explain. – Glen_b Feb 11 '15 at 09:58
20

I think the essence of some of the simple counterexamples can be seen by starting with a continuous random variable $X$ centred on zero, i.e. $E[X]=0$. Suppose the pdf of $X$ is even and defined on an interval of the form $(-a,a)$, where $a>0$. Now suppose $Y=f(X)$ for some function $f$. We now ask the question: for what kind of functions $f(X)$ can we have $Cov(X,f(X))=0$?

We know that $Cov(X,f(X))=E[Xf(X)]-E[X]E[f(X)]$. Our assumption that $E[X]=0$ leads us straight to $Cov(X,f(X))=E[Xf(X)]$. Denoting the pdf of $X$ via $p(\cdot)$, we have

$Cov(X,f(X))=E[Xf(X)]=\int_{-a}^{a}xf(x)p(x)dx$.

We want $Cov(X,f(X))=0$ and one way of achieving this is by ensuring $f(x)$ is an even function, which implies $xf(x)p(x)$ is an odd function. It then follows that $\int_{-a}^{a}xf(x)p(x)dx=0$, and so $Cov(X,f(X))=0$.

This way, we can see that the precise distribution of $X$ is unimportant as long as the pdf is symmetric about zero, and that any even function $f(\cdot)$ for which the expectations exist will do for defining $Y$ (a non-constant $f$ is needed for $Y$ to actually depend on $X$).

Hopefully, this can help students see how people come up with these types of counterexamples.
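As a rough numerical illustration of the argument, the sketch below approximates $Cov(X,f(X))$ with a midpoint rule. The density $p(x)=\tfrac34(1-x^2)$ on $(-1,1)$ and the functions $\cos x$ and $x^3$ are hypothetical choices made only for this example; the function name `cov_x_fx` is likewise an assumption.

```python
import math

def cov_x_fx(f, pdf, a, n=20_000):
    """Midpoint-rule approximation of Cov(X, f(X)) for a density on (-a, a)."""
    h = 2 * a / n
    xs = [-a + (i + 0.5) * h for i in range(n)]
    e_x = sum(x * pdf(x) for x in xs) * h           # E[X]
    e_f = sum(f(x) * pdf(x) for x in xs) * h        # E[f(X)]
    e_xf = sum(x * f(x) * pdf(x) for x in xs) * h   # E[X f(X)]
    return e_xf - e_x * e_f

def pdf(x):
    # Hypothetical even density on (-1, 1): p(x) = 3/4 * (1 - x^2).
    return 0.75 * (1 - x * x)

cov_even = cov_x_fx(math.cos, pdf, a=1.0)          # even f: integrand is odd
cov_odd = cov_x_fx(lambda x: x ** 3, pdf, a=1.0)   # odd f, for contrast

print(f"even f: Cov ~ {cov_even:.2e}")
print(f"odd  f: Cov ~ {cov_odd:.6f}")
```

With the even function the covariance vanishes up to rounding error, whereas the odd function gives a clearly nonzero value (analytically $3/35$ for this density).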

Michael R. Chernick
8

We can define a discrete random variable $X\in\{-1,0,1\}$ with $\mathbb{P}(X=-1)=\mathbb{P}(X=0)=\mathbb{P}(X=1)=\frac{1}{3}$

and then define $Y=\begin{cases}1,\quad\text{if}\quad X=0\\0,\quad\text{otherwise}\end{cases}$

It can be easily verified that $X$ and $Y$ are uncorrelated but not independent.
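For completeness, the verification can be sketched as follows, using only the definitions above:

$$E[X] = \tfrac{1}{3}(-1) + \tfrac{1}{3}(0) + \tfrac{1}{3}(1) = 0, \qquad E[XY] = 0 \quad \text{(since $Y=1$ only when $X=0$, so $XY=0$ always)},$$

$$Cov(X,Y) = E[XY] - E[X]E[Y] = 0 - 0\cdot\tfrac{1}{3} = 0,$$

$$\mathbb{P}(X=0, Y=1) = \tfrac{1}{3} \neq \tfrac{1}{9} = \mathbb{P}(X=0)\,\mathbb{P}(Y=1).$$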

StubbornAtom
7

Be the counterexample (i.e. hard-working student)! With that said:

I was trying to think of a real world example and this was the first that came to my mind. This will not be the mathematically simplest case (but if you understand this example, you should be able to find a simpler example with urns and balls or something).

According to some research, the average IQ of men and women is the same, but the variance of male IQ is greater than the variance of female IQ. For concreteness, let's say that male IQ follows $N(100, \sigma^2)$ and female IQ follows $N(100, \alpha \sigma^2)$ with $\alpha<1$. Half the population is male and half the population is female.

Assuming that this research is correct:

What is the correlation of gender and IQ?

Are gender and IQ independent?
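One hedged way to explore these two questions is a small simulation. The values $\sigma=15$ and $\alpha=0.8$ below are hypothetical and chosen only for illustration; they are not taken from any study.

```python
import random

random.seed(0)                 # fixed seed so the run is repeatable
SIGMA, ALPHA = 15.0, 0.8       # hypothetical values, not taken from any study
N = 100_000

def mean(values):
    return sum(values) / len(values)

def variance(values):
    m = mean(values)
    return mean([(v - m) ** 2 for v in values])

def correlation(u, v):
    mu, mv = mean(u), mean(v)
    cov = mean([(a - mu) * (b - mv) for a, b in zip(u, v)])
    return cov / (variance(u) * variance(v)) ** 0.5

# 1 = male, 0 = female, each with probability 1/2
gender = [random.randint(0, 1) for _ in range(N)]
# Same mean for both groups, smaller standard deviation for females
iq = [random.gauss(100.0, SIGMA if g == 1 else SIGMA * ALPHA ** 0.5)
      for g in gender]

r = correlation(gender, iq)
var_male = variance([x for g, x in zip(gender, iq) if g == 1])
var_female = variance([x for g, x in zip(gender, iq) if g == 0])

print(f"corr(gender, IQ) = {r:.4f}")
print(f"Var(IQ | male) = {var_male:.1f}, Var(IQ | female) = {var_female:.1f}")
```

Under these assumptions the sample correlation should come out close to zero, while the conditional variances differ: the distribution of IQ changes with gender even though the two are uncorrelated.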

Har
2

Try this (R code):

    # four equally likely points on the unit circle
    x = c(1, 0, -1, 0);
    y = c(0, 1, 0, -1);

    cor(x, y);
    # [1] 0

These points come from the equation of a circle, $x^2+y^2-r^2=0$, with $r=1$.

$y$ is not correlated with $x$, but the two are tied together deterministically by the circle equation, so they are dependent.

Glen_b
Analyst
  • 1
    Sample correlation zero does not mean that the true correlation is zero. – mpiktas Feb 04 '14 at 09:04
  • 3
    @mpiktas If those four values represent a bivariate distribution each with probability 1/4, the `cor` function returning zero will indicate a population correlation of zero. – Glen_b Feb 04 '14 at 11:04
  • @Glen_b I should have made better comments on the code. This might not be known to all. You can use semicolons, though I think it is not recommended as a coding style in R. – Analyst Feb 04 '14 at 11:51
  • 1
    @Glen_b yes you are correct. But this was not stated. Nice observation btw. – mpiktas Feb 04 '14 at 14:10
1

The best-known general case in which lack of correlation implies independence is when the joint distribution of $X$ and $Y$ is Gaussian (normal marginals alone are not enough).

  • 3
    This doesn't directly answer the question by producing a simple example - in that sense, it is more of a comment - but it does provide an indirect answer, in that it suggests a very wide set of possible examples. It might be worth rephrasing this post to make clearer how it answers the original question. – Silverfish Sep 23 '17 at 21:46
1

I ran across the example of a short straddle in this "Mini-lesson" by Nassim Taleb.

The payoff has the shape of an inverted V, with the peak where the price of the underlying security at expiration equals the strike price at which both the call and the put are sold. For example, if Microsoft shares last closed at \$248.15 and we sell a \$247.50 call expiring May 21, together with a put at the same strike and expiry, the two purchasers are betting in opposite directions: the call purchaser on the price finishing above the strike, the put purchaser on it finishing below (in each case by enough to cover the premium paid).

If Microsoft's price at expiration equals the strike price, both options expire unexercised and the seller of the short straddle keeps the maximum profit, namely the two premiums.

There is a clear dependency between the price of the underlying stock and the profit for the options trader, yet there is a zero correlation because both components move in symmetrical and opposite directions.
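A minimal sketch of why the correlation can vanish, assuming the expiry price is distributed symmetrically about the strike. The premium of 10 and the five equally likely expiry prices are hypothetical numbers chosen for illustration; only the \$247.50 strike comes from the example above.

```python
from fractions import Fraction as F

STRIKE = F("247.50")   # strike from the example above
PREMIUM = F(10)        # hypothetical total premium collected for call + put

# Hypothetical expiry prices, equally likely and symmetric about the strike.
prices = [STRIKE + d for d in (-10, -5, 0, 5, 10)]
prob = F(1, len(prices))

def payoff(s):
    """Short-straddle profit at expiry: premium minus what the exercised option costs."""
    return PREMIUM - abs(s - STRIKE)

e_s = sum(prob * s for s in prices)                  # E[price]
e_p = sum(prob * payoff(s) for s in prices)          # E[profit]
cov = sum(prob * (s - e_s) * (payoff(s) - e_p) for s in prices)

print("Cov(price, profit) =", cov)
print("profit by price:", {float(s): float(payoff(s)) for s in prices})
```

The profit is a deterministic function of the price, yet the covariance is exactly zero here. With an expiry distribution that is not symmetric about the strike, the covariance would generally be nonzero.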

Antoni Parellada
-1

A two-sentence answer: the clearest case of uncorrelated statistical dependence is a non-linear function of a RV, say $Y = X^n$ with $n$ even and $X$ symmetric about zero. The two RVs are clearly dependent yet not correlated, because correlation measures only linear association.

John Strong