Computing the covariance of two random variables given by linear combinations of samples drawn i.i.d. from the same distribution

Question

Suppose we want to compute the covariance between $z_1$ and $z_2$ given as follows:

$$ \begin{aligned} z_1 &= a_1 \cdot x_1 + a_2 \cdot x_2 \\ z_2 &= a_3 \cdot x_1 + a_4 \cdot x_2 \\ x_1, x_2 &\overset{\text{i.i.d.}}{\sim} p(x) \end{aligned} $$

If we compute the covariance between $z_1$ and $z_2$ we have:

$$ \begin{aligned} \mathbb{E}[z_1\cdot z_2] - \mathbb{E}[z_1]\cdot \mathbb{E}[z_2] &= \mathbb{E}[a_1a_3 x_1^2 + a_1a_4 x_1x_2 + a_2a_3 x_1x_2 + a_2a_4 x_2^2] \\ &\quad - a_1a_3\mathbb{E}[x_1]^2 - a_1a_4\mathbb{E}[x_1]\mathbb{E}[x_2] - a_2a_3\mathbb{E}[x_1]\mathbb{E}[x_2] - a_2a_4\mathbb{E}[x_2]^2 \\ &= a_1a_3\,\text{cov}[x] + a_2a_4\,\text{cov}[x] \end{aligned} $$

In the material I am studying, the last equality holds (or at least that is how I have understood it) because $x_1$ and $x_2$ are independent, hence their covariance is zero.
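A sketch of the step in question, assuming scalar $x_1, x_2$ drawn i.i.d. from $p(x)$ with finite variance $\sigma^2$ (the symbol $\sigma^2$ is notation introduced here, not taken from the question). Covariance is bilinear, so

$$ \begin{aligned} \text{cov}(z_1, z_2) &= a_1a_3\,\text{cov}(x_1,x_1) + a_1a_4\,\text{cov}(x_1,x_2) + a_2a_3\,\text{cov}(x_2,x_1) + a_2a_4\,\text{cov}(x_2,x_2) \\ &= a_1a_3\,\text{var}(x_1) + a_2a_4\,\text{var}(x_2) \\ &= (a_1a_3 + a_2a_4)\,\sigma^2 \end{aligned} $$

Independence removes only the two cross terms, since $\text{cov}(x_1,x_2) = 0$. The remaining terms $\text{cov}(x_i,x_i)$ are variances, and both enter with a plus sign. As a check, setting $a_1 = a_3$ and $a_2 = a_4$ gives $(a_1^2 + a_2^2)\,\sigma^2 = \text{var}(z_1)$. If $x$ is a vector and the $a_i$ are scalars, the same argument should go through with $\sigma^2$ replaced by the covariance matrix of $x$.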

That's not going to be true. Try the following in R: `set.seed(2021); N — Dave, May 17 '21 at 14:17
My question comes from the fact that independent samples have covariance zero. Perhaps I am missing something. Editing my question to make this clearer — jdeJuan, May 17 '21 at 14:23
Welcome to CV, jdeJuan. I wonder if you are losing the distinction between population and sample? @Dave is most definitely giving an example with a sample. — Alexis, May 17 '21 at 14:36
It looks like you goofed somewhere in your calculation, because covariance acts on two variables, not just the one $X$. (And what is $X$?) — Dave, May 17 '21 at 14:38
Hi @Alexis, thank you. I have edited my question to show what I am really interested in. The other is trying to be a simple example of the real thing I am trying to understand. — jdeJuan, May 17 '21 at 14:38
And your functions of $x_1$ and $x_2$ is not helping. Building off @Dave 's example `set.seed(2021); N — Alexis, May 17 '21 at 14:42
And the code @Alexis gave has independent `x1` and `x2`. // In the extreme, consider if $a_1 = a_3$ and $a_2 = a_4$. // Your question about the $a_1x_1 + a_2x_2$ seems quite different from what you originally asked. Perhaps consider editing the title and some of the earlier text in the question. — Dave, May 17 '21 at 14:47
If I set $a_1=a_2=a_3=1, a_4=2$ and do 10,000 replications with a sample size of 10 and that same `set.seed(2021)`, I get a minimum correlation of $0.42$. If I increase the sample size to 100, I get a minimum correlation of $0.88$. What you've proposed breaks down quite spectacularly (which is an excellent learning opportunity), and I am not sure that you're asking what you intend to ask. — Dave, May 17 '21 at 15:14
Thanks Dave, @Alexis. I have edited my question and title. I hope you can now understand what my problem is. Note that I am not saying that the overall covariance between $z_1$ and $z_2$ is going to be zero; I am asking how this covariance is computed, given that $\mathbb{E}[x_1x_2] - \mathbb{E}[x_1] \mathbb{E}[x_2] = COV(x_1,x_2) = 0$ because $x_1$ and $x_2$ have been drawn independently. — jdeJuan, May 17 '21 at 15:18
What do you mean by $cov[x]$? Covariance applies to two variables. — Dave, May 17 '21 at 15:20
The covariance of the vector $x$, i.e., of $p(x)$. Note that $p(x)$ may or may not be multivariate — jdeJuan, May 17 '21 at 15:21
https://stats.stackexchange.com/search?q=variance+linear+combination — whuber, May 17 '21 at 15:25
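
The R snippets in the comments above are truncated, so what follows is not a reconstruction of them. It is a minimal Python sketch of the same kind of check, under assumptions the thread does not state: scalar draws from a standard normal (so the variance is 1), and the coefficients $a_1=a_2=a_3=1$, $a_4=2$ taken from Dave's comment. The helper names `sample_cov` and `simulate` are hypothetical.

```python
import random


def sample_cov(u, v):
    """Unbiased sample covariance of two equal-length sequences."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / (n - 1)


def simulate(a1, a2, a3, a4, n=100_000, seed=2021):
    """Draw n independent pairs (x1, x2) and return (cov(x1, x2), cov(z1, z2))."""
    rng = random.Random(seed)
    # Assumption: p(x) is a standard normal, so var(x) = 1.
    x1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x2 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    z1 = [a1 * u + a2 * v for u, v in zip(x1, x2)]
    z2 = [a3 * u + a4 * v for u, v in zip(x1, x2)]
    return sample_cov(x1, x2), sample_cov(z1, z2)


cov_x, cov_z = simulate(1, 1, 1, 2)
# Value implied by bilinearity of covariance: (a1*a3 + a2*a4) * var(x).
expected = (1 * 1 + 1 * 2) * 1.0

print(f"sample cov(x1, x2) = {cov_x:.4f}  (theory: 0)")
print(f"sample cov(z1, z2) = {cov_z:.4f}  (theory: {expected})")
```

Under these assumptions the sample covariance of `x1` and `x2` should land near zero, while the covariance of `z1` and `z2` should land near $(a_1a_3 + a_2a_4)\,\text{var}(x) = 3$. That is consistent with the large correlations Dave reports: independence of the inputs does not make the two linear combinations uncorrelated.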

