7

I am reading a book that, on one page, talks about the cdf of a random vector. This is from the book:

Given $X=(X_1,...,X_n)$, each of the random variables $X_1, ... ,X_n$ can be characterized from a probabilistic point of view by its cdf.

However, the cdf of each coordinate of a random vector does not completely describe the probabilistic behaviour of the whole vector. For instance, if $U_1$ and $U_2$ are two independent random variables with the same cdf $G(x)$, the vectors $X=(X_1, X_2)$ defined respectively by $X_1=U_1$, $X_2=U_2$ and by $X_1=U_1$, $X_2=U_1$ each have coordinates with that same cdf, and yet they are quite different.

My question is:

In the very last paragraph, it says $U_1$ and $U_2$ come from the same c.d.f. Then they define $X=(X_1, X_2)$, but they say that $X=(X_1, X_2)$ is different from $X=(X_1, X_1)$. I don't really understand why the two $X$ are different.

(That is, I don't understand why $X=(X_1, X_2)$ and $X=(X_1, X_1)$ are different.) Isn't $X_1$ the same as $X_2$, so that it doesn't matter whether you put two copies of $X_1$ together to form $X=(X_1, X_1)$ or one $X_1$ and one $X_2$ to form $X=(X_1, X_2)$? Shouldn't they be the same? Why does the author say they are "quite different"?

Could someone explain why they are different?

Andre Silva
john_w
  • I throw a fair coin and record its outcome with a binary indicator ($X_1$). You throw a fair coin and similarly record its outcome ($X_2$). Is it not obvious those two random variables differ? Nature will not guarantee they always produce the same result, that's for sure! Yet they have identical distributions and are independent. – whuber Jan 23 '14 at 23:34
  • There are two simple but quite different ways to approach this (well, there are more than two, but I'll mention two). (i) by actually sampling $U_1,U_2$ and just looking at the density of points for the two definitions of $X$ (what software do you have available? Excel? Matlab? R? C?); (ii) Proceeding directly from definitions (i.e. do you know what the definition of the cdf is?). – Glen_b Jan 23 '14 at 23:42
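
To follow up on Glen_b's first suggestion, here is a minimal simulation sketch (assuming Python with NumPy, and taking $G$ to be the Uniform$[0,1]$ cdf purely for illustration) that contrasts the two constructions of $X$:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Two independent draws from the same cdf G (Uniform[0,1] chosen here).
u1 = rng.uniform(size=n)
u2 = rng.uniform(size=n)

# Construction 1: X = (U1, U2) -- the points spread over the unit square.
# Construction 2: X = (U1, U1) -- the points lie on the diagonal x2 = x1.
# Both constructions have Uniform[0,1] coordinates, yet, for example,
# P(X1 <= 1/2, X2 <= 1/2) differs between them:
p_indep = np.mean((u1 <= 0.5) & (u2 <= 0.5))   # about 0.25
p_copy  = np.mean((u1 <= 0.5) & (u1 <= 0.5))   # about 0.50
print(p_indep, p_copy)
```

The coordinates have identical marginal behaviour, but a scatter plot (or the quadrant probabilities above) shows the joint behaviour is very different: one cloud fills the square, the other concentrates on the diagonal.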

3 Answers

6

Let us take the simplest example of Bernoulli random variables with parameter $\frac12$. The value of the (joint) CDF $F_{X_1,X_2}(x,y)$ of $X_1$ and $X_2$ is the total probability mass in the southwest quadrant whose northeast corner is $(x,y)$.

  • If $X_1$ and $X_2$ are two independent Bernoulli random variables, then we have four probability masses of $\frac14$ sitting at $(0,0), (1,0), (0,1)$, and $(1,1)$. Hence $$F_{X_1,X_2}\left(\frac12,\frac12\right) = \frac14.$$

  • If $X_2 = 1-X_1$, then we have two probability masses of $\frac12$ sitting at $(1,0)$ and $(0,1)$. Hence $$F_{X_1,X_2}\left(\frac12,\frac12\right) = 0.$$

  • If $X_2 = X_1$, then we have two probability masses of $\frac12$ sitting at $(0,0)$ and $(1,1)$. Hence $$F_{X_1,X_2}\left(\frac12,\frac12\right) = \frac12.$$

Thus, the joint CDF of $X_1$ and $X_2$ does depend on what kind of relationship (if any) they have with each other, and just knowing the common CDF of $X_1$ and $X_2$ (these are the marginal CDFs) tells us nothing about the behavior of the joint CDF.
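
Here is a minimal sketch (assuming Python with NumPy; the simulation is just an illustration of the three bullet points above) that recovers the three values of $F_{X_1,X_2}\left(\frac12,\frac12\right)$ numerically:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# X1 is Bernoulli(1/2); the three cases differ only in how X2 is defined.
x1 = rng.integers(0, 2, size=n)
cases = {
    "independent": rng.integers(0, 2, size=n),  # a fresh Bernoulli(1/2)
    "X2 = 1 - X1": 1 - x1,
    "X2 = X1": x1,
}

# F_{X1,X2}(1/2, 1/2) is the probability mass in the quadrant
# {X1 <= 1/2, X2 <= 1/2}, i.e. P(X1 = 0, X2 = 0).
for name, x2 in cases.items():
    est = np.mean((x1 <= 0.5) & (x2 <= 0.5))
    print(f"{name}: F(1/2, 1/2) ~ {est:.3f}")
# Expected: about 0.25, 0.0, and 0.5 -- yet every X2 is Bernoulli(1/2).
```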

Dilip Sarwate
1

Random objects can have the same distribution and be almost surely different. Take a look:

Can two random variables have the same distribution, yet be almost surely different?

Zen
0

The key is the difference between joint and marginal distributions. When the book talks about the cdf of a random vector $X=(X_1,X_2)$, it means the function $$F_X(x_1,x_2):=\mathbb{P}(X_1\leq x_1,X_2\leq x_2 ).$$ As you probably know, if $X_1$ and $X_2$ are independent, this equals $$F_{X_1}(x_1)F_{X_2}(x_2)=\mathbb{P}(X_1\leq x_1)\mathbb{P}(X_2\leq x_2),$$ but if they are not independent, we cannot factor the joint probability into that product; that is exactly what dependence means.

Indeed, in your example where $X_2=X_1$, we have $$F_X(x_1,x_2):=\mathbb{P}(X_1\leq x_1,X_2\leq x_2 )=\mathbb{P}(X_1\leq x_1,X_1\leq x_2 )=\mathbb{P}(X_1\leq \min(x_1, x_2)),$$ which in general is not equal to $\mathbb{P}(X_1\leq x_1)\mathbb{P}(X_1\leq x_2)$.

Again, the key point is that even though the coordinates have the same marginal distribution in both constructions, the marginals alone do not determine the joint distribution of the vector.
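
As a quick numerical check of the display above, here is a minimal sketch assuming $X_1$ is Uniform$[0,1]$ and $X_2 = X_1$ (the choice of distribution is mine, just for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
x1 = rng.uniform(size=200_000)   # X1 ~ Uniform[0,1], and we set X2 = X1

a, b = 0.3, 0.7                  # an arbitrary evaluation point (x1, x2)

joint   = np.mean((x1 <= a) & (x1 <= b))   # P(X1 <= a, X1 <= b)
min_cdf = min(a, b)                        # P(X1 <= min(a, b)) for Uniform[0,1]
product = a * b                            # P(X1 <= a) * P(X1 <= b)

print(joint, min_cdf, product)   # joint matches min(a, b) = 0.3, not 0.21
```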

ekvall