
Hello, I have two independent variables $P$ and $Q$. Both are non-negative. Now I define two new variables based on them:

The first variable is $$R_1=\alpha P+(1-\alpha)Q.$$ Since $P$ and $Q$ are independent, $$Var(R_1)=\alpha^2 Var(P)+(1-\alpha)^2Var(Q).$$

The second variable $R_2$ is a sort of compound variable: with probability $\alpha$ we get $P$, and with probability $1-\alpha$ we get $Q$. I work out the variance as $$Var(R_2)=\alpha E(P^2)+(1-\alpha)E(Q^2)-(\alpha E(P)+(1-\alpha)E(Q))^2.$$

My intuition is that $$Var(R_2) \geq Var(R_1).$$ Could anyone help with a proof of this intuition?

Here is an example. Let $P=Q=(10,0.5;0,0.5)$, meaning there is probability 0.5 of getting 10 and probability 0.5 of getting 0. Let $\alpha=0.5$. Then $R_1=(10,0.25;5,0.5;0,0.25)$ and $R_2=(10,0.5;0,0.5)$. We get $Var(R_1)=12.5$ and $Var(R_2)=25$.

I tried a couple of other examples, and they all show that $Var(R_2)>Var(R_1)$.
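The worked example above can be checked exactly with a short Python sketch (the helper `var` and the distribution-as-pairs encoding are just illustrative choices, not anything from the question):

```python
def var(values_probs):
    """Variance of a discrete distribution given as [(value, prob), ...]."""
    mean = sum(v * p for v, p in values_probs)
    return sum(p * (v - mean) ** 2 for v, p in values_probs)

# P = Q = (10 w.p. 0.5; 0 w.p. 0.5), alpha = 0.5.
# R1 = 0.5*P + 0.5*Q takes values 10, 5, 0 with probs 0.25, 0.5, 0.25.
var_r1 = var([(10, 0.25), (5, 0.5), (0, 0.25)])  # -> 12.5
# R2 is the mixture: pick P or Q with probability 0.5 each, so R2 = (10, 0.5; 0, 0.5).
var_r2 = var([(10, 0.5), (0, 0.5)])              # -> 25.0
```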

Emma
    Your "sort of compound variable" is called a *mixture.* Formulas for the variance are given at https://stats.stackexchange.com/questions/16608. – whuber Mar 24 '21 at 11:51

1 Answer


The inequality should follow from the Law of Total Variance. Also, I'm assuming $\alpha \in [0,1]$, as otherwise the construction doesn't make sense.

We need a slightly more formal definition of your compound variable. Let $Z$ be a binary variable, independent of $P$ and $Q$, with $P(Z=1)=\alpha$ and $P(Z=0)=1-\alpha$; then we can define $R_2 := Z P + (1-Z)Q$ (I assume this is what you mean).

We consider the variance of $R_2$ using the Law of Total Variance, conditioned on $Z$. I'll use subscripts on $E$ and $Var$ to denote the conditional expectation/variance.

$Var(R_2) = E_Z[Var_Z(R_2)] + Var_Z(E_Z[R_2])$

We want a lower bound on this variance, so we can discard the second term, which is $\geq 0$. Then we write out the first expectation over the two possible values $Z=1$ and $Z=0$. Recall that conditioned on $Z=1$ we have $R_2 = P$, and conditioned on $Z=0$ we have $R_2 = Q$.

$ Var(R_2) \geq E_Z[Var_Z(R_2)] = \alpha Var(P) + (1-\alpha)Var(Q)$

Thus, our lower bound is a mixture of $Var(P)$ and $Var(Q)$. Comparing this to your expression for $Var(R_1)$, note that for $\alpha \in [0,1]$ we have $\alpha \geq \alpha^2$ and $(1-\alpha) \geq (1-\alpha)^2$, which implies

$Var(R_2) \geq Var(R_1).$

While crude, this should be an easy way to understand the intuition behind the bound.
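As a sanity check, here is a small Monte Carlo sketch of the inequality. The particular distributions (exponential for $P$, uniform for $Q$) and $\alpha = 0.3$ are arbitrary choices for illustration, not anything from the question; the mixture is sampled exactly as defined above via the latent $Z$:

```python
import random

random.seed(0)

def simulate(alpha, sample_p, sample_q, n=200_000):
    """Monte Carlo estimates of Var(R1) and Var(R2)."""
    r1, r2 = [], []
    for _ in range(n):
        p, q = sample_p(), sample_q()
        r1.append(alpha * p + (1 - alpha) * q)          # convex combination
        r2.append(p if random.random() < alpha else q)  # mixture via Z
    def var(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)
    return var(r1), var(r2)

# Arbitrary example: P ~ Exp(1), Q ~ Uniform(0, 4), alpha = 0.3.
v1, v2 = simulate(0.3, lambda: random.expovariate(1.0),
                       lambda: random.uniform(0, 4))
# Expect v2 >= v1, matching Var(R_2) >= Var(R_1).
```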

Ziddletwix