
Say I have two probability densities $P(x_1, x_2, \ldots, x_d)$ and $Q(z_1, z_2, \ldots, z_d)$ over two $d$-dimensional spaces.

Define two marginal distributions as: $$P(x_i) = \int^{+\infty}_{-\infty}\underbrace{dx_1\ldots dx_{d}}_{\text{without } x_i}\, P(x_1, x_2,\ldots ,x_d)$$ $$Q(z_i) = \int^{+\infty}_{-\infty}\underbrace{dz_1\ldots dz_{d}}_{\text{without } z_i}\, Q(z_1, z_2,\ldots ,z_d)$$ Since $\int_{-\infty}^{+\infty}P(x_i)\,dx_i = 1$ and $\int_{-\infty}^{+\infty}Q(z_i)\,dz_i = 1$, let $F(x_i)$ and $G(z_i)$ be the CDFs of $P(x_i)$ and $Q(z_i)$ respectively. By the probability integral transform, we have a one-to-one and onto map $$x_i = F^{-1}(G(z_i)) = f_{i}(z_i)$$
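The one-dimensional transform $x_i = F^{-1}(G(z_i))$ can be sketched numerically; a minimal example, assuming (hypothetically) that $Q(z_i)$ is a standard normal density and $P(x_i)$ is Exponential(1):

```python
# Sketch of the marginal probability integral transform x = F^{-1}(G(z)),
# with hypothetical choices: Q = standard normal, P = Exponential(1).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
z = rng.standard_normal(100_000)   # samples from Q

u = stats.norm.cdf(z)              # G(z) is Uniform(0, 1) by the PIT
x = stats.expon.ppf(u)             # F^{-1}(u) then follows Exponential(1)

print(x.mean())                    # close to 1, the Exponential(1) mean
```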

My question is:

If I apply this to each pair of variables $(x_i, z_i)$, at the end of the day, will I get this? $$\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_d\end{pmatrix} = \begin{pmatrix} f_{1}(z_1) \\ f_{2}(z_2) \\ \vdots \\ f_{d}(z_d)\end{pmatrix}$$

I think I can also do this with conditional distributions.

Since $$\int_{-\infty}^{+\infty}dx_j\, P(x_j|x_i)P(x_i) = \int_{-\infty}^{+\infty}dx_j\, P(x_j, x_i) = P(x_i),$$ we have $$\int_{-\infty}^{+\infty}dx_j\, P(x_j|x_i) = 1$$

For $x_1$ and $z_1$, I still find the map $x_1 = f_1(z_1)$ by the approach described above. Then I randomly choose a $z_1$ and the corresponding $x_1$, and find the two conditional distributions $P(x_2|x_1)$ and $Q(z_2|z_1)$. Let $F(x_2|x_1)$ and $G(z_2|z_1)$ be the CDFs of the two conditional distributions respectively, so I can also find $$x_2 = F^{-1}(G(z_2|z_1)|x_1) = f_2(z_1, z_2)$$ By doing this layer by layer, at the end of the day, what I will have is $$\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_d\end{pmatrix} = \begin{pmatrix} f_{1}(z_1) \\ f_{2}(z_1, z_2) \\ \vdots \\ f_{d}(z_1, z_2, \ldots, z_d)\end{pmatrix}$$

In this approach, each $x_i$ depends not only on $z_i$ but on all of $z_1, \ldots, z_i$, unlike the nicely disentangled expression above.

Are there any mistakes in my math? Moreover, simply for the sake of proving a theorem, are there any differences between the two approaches?

meTchaikovsky

1 Answer


This is an interesting question, somewhat related to copulas. In the first proposal, when defining $$\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_d\end{pmatrix} = \begin{pmatrix} f_{1}(z_1) \\ f_{2}(z_2) \\ \vdots \\ f_{d}(z_d)\end{pmatrix}$$ the transforms act on the marginals. Therefore, $X_1$ has the correct marginal distribution $P_1$, $X_2$ has the correct marginal distribution $P_2$, etc. But the transform of the vector ${\bf Z}=(Z_1,\ldots,Z_d)$ carries the correlation structure of this vector into a correlation structure for the vector ${\bf X}=(X_1,\ldots,X_d)$ that is not the original correlation structure (except in rare cases, as when the components are independent for both $\bf X$ and $\bf Z$). This transform fails to reproduce the joint distribution of $\bf X$.
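This failure can be illustrated numerically; a minimal sketch, under the hypothetical assumptions that $\bf Z$ is bivariate Gaussian with correlation 0.9 and the target marginals are Exponential(1). Each component-wise map is strictly increasing, so the ranks of the samples, and hence the Spearman (rank) correlation, are inherited unchanged from $\bf Z$:

```python
# Component-wise transform gets the marginals right but simply inherits
# the dependence structure of Z (hypothetical choices of distributions).
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
rho = 0.9
z = rng.multivariate_normal([0, 0], [[1, rho], [rho, 1]], size=100_000)

# transform each coordinate to an Exponential(1) marginal
x = stats.expon.ppf(stats.norm.cdf(z))

print(x[:, 0].mean())  # each marginal is correct: mean close to 1

# but the rank correlation is exactly that of Z, not freely chosen
rho_z_rank, _ = stats.spearmanr(z[:, 0], z[:, 1])
rho_x_rank, _ = stats.spearmanr(x[:, 0], x[:, 1])
print(rho_z_rank, rho_x_rank)  # identical: monotone maps preserve ranks
```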

In the second case, the joint distribution of $\bf X$ is correctly preserved: when $$X_1=F_1^{-1}(G_1(Z_1)) \qquad X_2=F_2^{-1}(G_2(Z_2|Z_1)|X_1),$$ which is equivalent to $$X_1=F_1^{-1}(U_1) \qquad X_2=F_2^{-1}(U_2|X_1)$$ with $U_1$ and $U_2$ independent ${\cal U}(0,1)$, they satisfy $$X_1\sim F_1(x_1)\qquad X_2|X_1=x_1\sim F_{2|1}(x_2|x_1)$$ and hence $$(X_1,X_2)\sim F_{1,2}(x_1,x_2),$$ the correct joint distribution.
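A sketch of this layered construction for a case where the conditionals are available in closed form, assuming (hypothetically) that both $\bf Z$ and the target $\bf X$ are bivariate Gaussian with unit variances and correlations $\rho_z = 0.3$ and $\rho_x = 0.8$; for a bivariate Gaussian, $X_2 \mid X_1 = x_1 \sim {\cal N}(\rho\, x_1, 1-\rho^2)$:

```python
# Conditional (triangular) transform: layer 1 matches the marginal of X1,
# layer 2 pushes G(z2|z1) through F^{-1}(.|x1). Hypothetical Gaussian example.
import numpy as np
from scipy import stats

rho_z, rho_x = 0.3, 0.8
rng = np.random.default_rng(1)

# sample Z with correlation rho_z
cov_z = [[1.0, rho_z], [rho_z, 1.0]]
z1, z2 = rng.multivariate_normal([0, 0], cov_z, size=200_000).T

# layer 1: x1 = F1^{-1}(G1(z1)); both marginals standard normal, so f1 = id
x1 = stats.norm.ppf(stats.norm.cdf(z1))

# layer 2: x2 = F^{-1}(G(z2|z1) | x1) using the Gaussian conditionals
u2 = stats.norm.cdf(z2, loc=rho_z * z1, scale=np.sqrt(1 - rho_z**2))  # G(z2|z1)
x2 = stats.norm.ppf(u2, loc=rho_x * x1, scale=np.sqrt(1 - rho_x**2))  # F^{-1}(.|x1)

corr = np.corrcoef(x1, x2)[0, 1]
print(corr)   # close to rho_x = 0.8: the target joint is reproduced
```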

Comparing with the first proposal, $$X_1=F_1^{-1}(G_1(Z_1)) \qquad X_2=F_2^{-1}(G_2(Z_2))$$ is equivalent to $$X_1=F_1^{-1}(U_1) \qquad X_2=F_2^{-1}(U_2)$$ with $U_1$ and $U_2$ dependent ${\cal U}(0,1)$ random variables.

Xi'an