7

I am reading a book that, on one page, talks about the cdf of a random vector. This is from the book:

Given $X=(X_1,...,X_n)$, each of the random variables $X_1, ... ,X_n$ can be characterized from a probabilistic point of view by its cdf.

However, the cdf of each coordinate of a random vector does not completely describe the probabilistic behaviour of the whole vector. For instance, if $U_1$ and $U_2$ are two independent random variables with the same cdf $G(x)$, the vectors $X=(X_1, X_2)$ defined respectively by $X_1=U_1$, $X_2=U_2$ and by $X_1=U_1$, $X_2=U_1$ each have coordinates with that same cdf, and yet they are quite different.

My question is:

In the very last paragraph, it says $U_1$ and $U_2$ come from the same c.d.f. Then they define $X=(X_1, X_2)$, but they say that $X=(X_1, X_2)$ is different from $X=(X_1, X_1)$. I don't really understand why the two $X$ are different.

(That is, I don't understand why $X=(X_1, X_2)$ and $X=(X_1, X_1)$ are different.) Isn't $X_1$ the same as $X_2$, so that it doesn't matter whether you put two copies of $X_1$ together to form $X=(X_1, X_1)$ or one $X_1$ and one $X_2$ to form $X=(X_1, X_2)$? Shouldn't they be the same? Why does the author say they are "quite different"?

Could someone explain why they are different?

Andre Silva
john_w
  • I throw a fair coin and record its outcome with a binary indicator ($X_1$). You throw a fair coin and similarly record its outcome ($X_2$). Is it not obvious those two random variables differ? Nature will not guarantee they always produce the same result, that's for sure! Yet they have identical distributions and are independent. – whuber Jan 23 '14 at 23:34
  • There are two simple but quite different ways to approach this (well, there are more than two, but I'll mention two). (i) by actually sampling $U_1,U_2$ and just looking at the density of points for the two definitions of $X$ (what software do you have available? Excel? Matlab? R? C?); (ii) Proceeding directly from definitions (i.e. do you know what the definition of the cdf is?). – Glen_b Jan 23 '14 at 23:42
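
To follow up on Glen_b's first suggestion, here is a minimal simulation sketch (assuming Python with NumPy, and taking $G$ to be the Uniform$[0,1]$ cdf purely for illustration) that contrasts the two constructions of $X$:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Two independent draws from the same cdf G (Uniform[0,1] chosen here).
u1 = rng.uniform(size=n)
u2 = rng.uniform(size=n)

# Construction 1: X = (U1, U2) -- the points spread over the unit square.
# Construction 2: X = (U1, U1) -- the points lie on the diagonal x2 = x1.
# Both constructions have Uniform[0,1] coordinates, yet, for example,
# P(X1 <= 1/2, X2 <= 1/2) differs between them:
p_indep = np.mean((u1 <= 0.5) & (u2 <= 0.5))   # about 0.25
p_copy  = np.mean((u1 <= 0.5) & (u1 <= 0.5))   # about 0.50
print(p_indep, p_copy)
```

The coordinates have identical marginal behaviour, but a scatter plot (or the quadrant probabilities above) shows the joint behaviour is very different: one cloud fills the square, the other concentrates on the diagonal.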

3 Answers

6

Let us take the simplest example of Bernoulli random variables with parameter $\frac12$. The value of the (joint) CDF $F_{X_1,X_2}(x,y)$ of $X_1$ and $X_2$ is the total probability mass in the southwest quadrant whose northeast corner is $(x,y)$.

  • If $X_1$ and $X_2$ are two independent Bernoulli random variables, then we have four probability masses of $\frac14$ sitting at $(0,0), (1,0), (0,1)$, and $(1,1)$. Hence $$F_{X_1,X_2}\left(\frac12,\frac12\right) = \frac14.$$

  • If $X_2 = 1-X_1$, then we have two probability masses of $\frac12$ sitting at $(1,0)$ and $(0,1)$. Hence $$F_{X_1,X_2}\left(\frac12,\frac12\right) = 0.$$

  • If $X_2 = X_1$, then we have two probability masses of $\frac12$ sitting at $(0,0)$ and $(1,1)$. Hence $$F_{X_1,X_2}\left(\frac12,\frac12\right) = \frac12.$$

Thus, the joint CDF of $X_1$ and $X_2$ does depend on what kind of relationship (if any) they have with each other, and just knowing the common CDF of $X_1$ and $X_2$ (these are the marginal CDFs) tells us nothing about the behavior of the joint CDF.
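
Here is a minimal sketch (assuming Python with NumPy; the simulation is just an illustration of the three bullet points above) that recovers the three values of $F_{X_1,X_2}\left(\frac12,\frac12\right)$ numerically:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# X1 is Bernoulli(1/2); the three cases differ only in how X2 is defined.
x1 = rng.integers(0, 2, size=n)
cases = {
    "independent": rng.integers(0, 2, size=n),  # a fresh Bernoulli(1/2)
    "X2 = 1 - X1": 1 - x1,
    "X2 = X1": x1,
}

# F_{X1,X2}(1/2, 1/2) is the probability mass in the quadrant
# {X1 <= 1/2, X2 <= 1/2}, i.e. P(X1 = 0, X2 = 0).
for name, x2 in cases.items():
    est = np.mean((x1 <= 0.5) & (x2 <= 0.5))
    print(f"{name}: F(1/2, 1/2) ~ {est:.3f}")
# Expected: about 0.25, 0.0, and 0.5 -- yet every X2 is Bernoulli(1/2).
```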

Dilip Sarwate
1

Random objects can have the same distribution and be almost surely different. Take a look:

Can two random variables have the same distribution, yet be almost surely different?

Zen
0

The key is the difference between joint and marginal distributions. When the book talks about the cdf of a random vector $X=(X_1,X_2)$, it means the function $$F_X(x_1,x_2):=\mathbb{P}(X_1\leq x_1,X_2\leq x_2 ).$$ As you probably know, if $X_1$ and $X_2$ are independent, this equals $$F_{X_1}(x_1)F_{X_2}(x_2)=\mathbb{P}(X_1\leq x_1)\mathbb{P}(X_2\leq x_2),$$ but if they are not independent, we cannot factor the joint probability into that product; that is exactly what dependence means.

Indeed, in your example where $X_2=X_1$, we have $$F_X(x_1,x_2):=\mathbb{P}(X_1\leq x_1,X_2\leq x_2 )=\mathbb{P}(X_1\leq x_1,X_1\leq x_2 )=\mathbb{P}(X_1\leq \min(x_1, x_2)),$$ which in general is not equal to $\mathbb{P}(X_1\leq x_1)\mathbb{P}(X_1\leq x_2)$.

Again, the key point is that even though the coordinates have the same marginal distribution in both constructions, the marginals alone do not determine the joint distribution of the vector.
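
As a quick numerical check of the display above, here is a minimal sketch assuming $X_1$ is Uniform$[0,1]$ and $X_2 = X_1$ (the choice of distribution is mine, just for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
x1 = rng.uniform(size=200_000)   # X1 ~ Uniform[0,1], and we set X2 = X1

a, b = 0.3, 0.7                  # an arbitrary evaluation point (x1, x2)

joint   = np.mean((x1 <= a) & (x1 <= b))   # P(X1 <= a, X1 <= b)
min_cdf = min(a, b)                        # P(X1 <= min(a, b)) for Uniform[0,1]
product = a * b                            # P(X1 <= a) * P(X1 <= b)

print(joint, min_cdf, product)   # joint matches min(a, b) = 0.3, not 0.21
```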

ekvall