Given the coefficients of determination between variables $X$ and $Y$ ($r_{XY}^2$) and between $Y$ and $Z$ ($r_{YZ}^2$), what is the expected coefficient of determination between variables $X$ and $Z$?
Initially, I thought $r_{XY}^2 \centerdot r_{YZ}^2$ might be a good approximation of $r_{XZ}^2$. I played around with the formulae a bit (see edit below) to get:
$r_{XZ}^2 = \frac{\bigl(\sum_{i = 1}^n (X_i - \bar X)(Z_i - \bar Z)\bigr)^2}{\sum_{i = 1}^n (X_i - \bar X)^2 \centerdot \sum_{i = 1}^n(Z_i - \bar Z)^2}$
$r_{XZ}^2 = \frac{\bigl(\sum_{i = 1}^n (X_i - \bar X)(Z_i - \bar Z)\bigr)^2 \centerdot\bigl(\sum_{i = 1}^n (Y_i - \bar Y)^2\bigr)^2 \ }{\bigl(\sum_{i = 1}^n (X_i - \bar X)(Y_i - \bar Y)\bigr)^2\centerdot\bigl(\sum_{i = 1}^n (Y_i - \bar Y)(Z_i - \bar Z)\bigr)^2} \centerdot r_{XY}^2 \centerdot r_{YZ}^2$
I ran thousands of simulations to see how $\frac{\bigl(\sum_{i = 1}^n (X_i - \bar X)(Z_i - \bar Z)\bigr)^2 \centerdot\bigl(\sum_{i = 1}^n (Y_i - \bar Y)^2\bigr)^2 \ }{\bigl(\sum_{i = 1}^n (X_i - \bar X)(Y_i - \bar Y)\bigr)^2\centerdot\bigl(\sum_{i = 1}^n (Y_i - \bar Y)(Z_i - \bar Z)\bigr)^2}$ behaved.
For large $n$, this was almost always close to $1$ (and so I thought my hunch was right).
However, I discovered that while the distribution of values had a median close to $1$, it had a mean consistently in the thousands!
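For anyone who wants to reproduce this kind of experiment, here is a minimal sketch. The data-generating process (a chain where $Y$ is $X$ plus noise and $Z$ is $Y$ plus noise), the function names `correction_factor` and `simulate`, and all parameter values are illustrative assumptions, not necessarily the setup behind the numbers quoted above.

```python
import numpy as np

def correction_factor(x, y, z):
    """Return (S_XZ^2 * S_YY^2) / (S_XY^2 * S_YZ^2) for three equal-length samples."""
    dx, dy, dz = x - x.mean(), y - y.mean(), z - z.mean()
    s_xz = np.sum(dx * dz)
    s_xy = np.sum(dx * dy)
    s_yz = np.sum(dy * dz)
    s_yy = np.sum(dy * dy)
    return (s_xz ** 2 * s_yy ** 2) / (s_xy ** 2 * s_yz ** 2)

def simulate(n, trials, noise=1.0, seed=0):
    """Draw `trials` samples of size n from an assumed chain X -> Y -> Z
    and return the correction factor of each sample."""
    rng = np.random.default_rng(seed)
    factors = np.empty(trials)
    for t in range(trials):
        x = rng.normal(size=n)
        y = x + noise * rng.normal(size=n)  # assumption: Y depends on X
        z = y + noise * rng.normal(size=n)  # assumption: Z depends on Y only
        factors[t] = correction_factor(x, y, z)
    return factors

factors = simulate(n=200, trials=2000)
print("median:", np.median(factors))
print("mean:  ", np.mean(factors))
```

One plausible reason for a gap between median and mean is that the denominator contains the squared sample covariances of $(X, Y)$ and $(Y, Z)$, which can come arbitrarily close to zero, so a handful of trials can dominate the mean. Under this assumed model, raising `noise` (weaker correlations) should make such trials more likely.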
Is there a nice way to characterise this value as a function of $n$, and thereby obtain better estimates of $r_{XZ}^2$?
EDIT:
We know that $r_{XY}^2 = \frac{\bigl(\sum_{i = 1}^n (X_i - \bar X)(Y_i - \bar Y)\bigr)^2}{\sum_{i = 1}^n (X_i - \bar X)^2 \centerdot \sum_{i = 1}^n(Y_i - \bar Y)^2}$ and $r_{YZ}^2 = \frac{\bigl(\sum_{i = 1}^n (Y_i - \bar Y)(Z_i - \bar Z)\bigr)^2}{\sum_{i = 1}^n (Y_i - \bar Y)^2 \centerdot \sum_{i = 1}^n(Z_i - \bar Z)^2}$.
Solving each for the sum of squares that does not involve $Y$ gives $\sum_{i = 1}^n (X_i - \bar X)^2 = \frac{\bigl(\sum_{i = 1}^n (X_i - \bar X)(Y_i - \bar Y)\bigr)^2}{r_{XY}^2 \centerdot \sum_{i = 1}^n(Y_i - \bar Y)^2}$ and $\sum_{i = 1}^n(Z_i - \bar Z)^2 = \frac{\bigl(\sum_{i = 1}^n (Y_i - \bar Y)(Z_i - \bar Z)\bigr)^2}{\sum_{i = 1}^n (Y_i - \bar Y)^2 \centerdot r_{YZ}^2}$.
Since $r_{XZ}^2 = \frac{\bigl(\sum_{i = 1}^n (X_i - \bar X)(Z_i - \bar Z)\bigr)^2}{\sum_{i = 1}^n (X_i - \bar X)^2 \centerdot \sum_{i = 1}^n(Z_i - \bar Z)^2}$, substituting these two expressions into its denominator gives $r_{XZ}^2 = \frac{\bigl(\sum_{i = 1}^n (X_i - \bar X)(Z_i - \bar Z)\bigr)^2 \centerdot\bigl(\sum_{i = 1}^n (Y_i - \bar Y)^2\bigr)^2 \ }{\bigl(\sum_{i = 1}^n (X_i - \bar X)(Y_i - \bar Y)\bigr)^2\centerdot\bigl(\sum_{i = 1}^n (Y_i - \bar Y)(Z_i - \bar Z)\bigr)^2} \centerdot r_{XY}^2 \centerdot r_{YZ}^2$.
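As a sanity check, the identity can be verified numerically on arbitrary data. This is only a sketch: the sample size, the seed, and the shorthand names (`s_xy` for $\sum_{i = 1}^n (X_i - \bar X)(Y_i - \bar Y)$, and so on) are my own choices for illustration. Because the identity is purely algebraic, it should hold for any sample in which the two covariances in the denominator are non-zero.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 30
x, y, z = rng.normal(size=(3, n))  # three arbitrary, unrelated samples

# Centered values and the sums of squares / cross products
dx, dy, dz = x - x.mean(), y - y.mean(), z - z.mean()
s_xx, s_yy, s_zz = dx @ dx, dy @ dy, dz @ dz
s_xy, s_yz, s_xz = dx @ dy, dy @ dz, dx @ dz

# Coefficients of determination straight from their definitions
r2_xy = s_xy**2 / (s_xx * s_yy)
r2_yz = s_yz**2 / (s_yy * s_zz)
r2_xz = s_xz**2 / (s_xx * s_zz)

# Right-hand side of the identity: factor * r_XY^2 * r_YZ^2
factor = (s_xz**2 * s_yy**2) / (s_xy**2 * s_yz**2)
rhs = factor * r2_xy * r2_yz

print(r2_xz, rhs)
print(np.isclose(r2_xz, rhs))
```

Note that the check passes regardless of how the three variables are related, which shows that the identity by itself says nothing about how close the factor is to $1$.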