4

How can the following formula be derived?

$$ \sigma(\epsilon) = \sigma(y) \sqrt{1-\rho(x,y)^2} $$

Here $y = a + bx + \epsilon$, $\sigma(y)$ is the standard deviation of the dependent variable $y$, $\rho(x,y)$ is the correlation between $x$ and $y$, and $\sigma(\epsilon)$ is the standard deviation of the error.

Comp_Warrior
Mauro Augusto
  • The answer at http://stats.stackexchange.com/a/1448 includes an illustration showing this result is merely the Pythagorean Theorem. – whuber Aug 30 '13 at 21:46
  • This reads like a standard textbook question. In what context does this question arise (i.e. what are you doing that it would lead you to ask this?) – Glen_b Aug 31 '13 at 00:14
  • Thank you so much everyone this really helped me out (to Comp_Warrior for editing it and for those who answered). This was a question in the lecture notes I had and I tried for about an hour to get it but because the textbook I use focuses more on the estimates and not true value of a regression I was getting confused. Thanks again. – Mauro Augusto Aug 31 '13 at 10:36

2 Answers

3

Since $X$ and $\epsilon$ are assumed to be independent, and hence uncorrelated, you can take the variance of both sides of $Y = a + bX + \epsilon$ (subtract the means, square and average) to give $\sigma^2_Y = b^2 \sigma^2_X + \sigma^2_\epsilon$.

But $b = \operatorname{Cov}(X,Y)/\sigma^2_X = \rho_{X,Y}\,\sigma_Y/\sigma_X$, so $b^2 \sigma^2_X = \rho_{X,Y}^2 \sigma^2_Y$, and you can substitute and rearrange to get your result.
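
As a rough numerical sanity check of the two steps above (not part of the original answer), the sketch below simulates $Y = a + bX + \epsilon$ with NumPy and compares both sides of each identity. The sample size, seed, coefficients and normal distributions are all illustrative assumptions; any independent $X$ and $\epsilon$ with finite variance should behave similarly.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
n = 200_000
a, b = 1.0, 2.0                 # illustrative intercept and slope (assumed values)
x = rng.normal(0.0, 1.5, n)
eps = rng.normal(0.0, 0.7, n)   # noise drawn independently of x
y = a + b * x + eps

var_x, var_y, var_eps = x.var(), y.var(), eps.var()
rho = np.corrcoef(x, y)[0, 1]

# Variance decomposition: Var(Y) should be close to b^2 Var(X) + Var(eps)
decomposition_gap = var_y - (b**2 * var_x + var_eps)

# Target identity: sigma_eps should be close to sigma_y * sqrt(1 - rho^2)
sigma_eps_from_rho = np.sqrt(var_y * (1 - rho**2))

print(decomposition_gap, np.sqrt(var_eps), sigma_eps_from_rho)
```

The gap in the first identity is only approximately zero in a finite sample, because the sample covariance between `x` and `eps` is small but not exactly zero.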

Henry
2

And here is a really explicit derivation:

For simplicity, assume that $a=0$ and that your data are mean-adjusted, so $E[x]=0$ and $E[y]=0$. Writing $\beta$ for the slope $b$:

$$\epsilon=y-\beta x$$ $$E[\epsilon]=E[y]-E[x]\beta=0$$

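Because $E[\epsilon]=0$, the variance of $\epsilon$ is just $E[\epsilon^2]$: $$Var[\epsilon]=E[(y-\beta x)^2]$$ $$=E[y^2-2y\beta x+\beta^2 x^2]$$ $$=E[y^2]-2\beta E[xy]+\beta^2E[x^2]$$ $$=\sigma_y^2-2\beta \sigma_{xy}+\beta^2 \sigma_x^2$$ where the last line uses the zero means, so that $E[y^2]=\sigma_y^2$, $E[xy]=\sigma_{xy}$ and $E[x^2]=\sigma_x^2$.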

The slope coefficient $\beta$ satisfies: $$\beta=\frac{\sigma_{xy}}{\sigma_x^2}$$ Inserting this into the last formula gives: $$Var[\epsilon]=\sigma_y^2-2\frac{\sigma_{xy}}{\sigma_x^2}\sigma_{xy}+\frac{\sigma_{xy}^2}{\sigma_x^2 \sigma_x^2}\sigma_x^2$$

Then multiply the last two terms by $\frac{\sigma_y^2}{\sigma_y^2}=1$ and use $\rho_{xy}^2=\frac{\sigma_{xy}^2}{\sigma_x^2 \sigma_y^2}$ to get: $$Var[\epsilon]=\sigma_y^2-2\frac{\sigma_{xy}^2}{\sigma_x^2} \frac{\sigma_y^2}{\sigma_y^2}+\frac{\sigma_{xy}^2}{\sigma_x^2 \sigma_x^2}\sigma_x^2 \frac{\sigma_y^2}{\sigma_y^2}$$ $$=\sigma_y^2-2\rho_{xy}^2\sigma_y^2+\rho_{xy}^2\sigma_y^2$$ $$=\sigma_y^2-\rho_{xy}^2\sigma_y^2=\sigma_y^2(1-\rho_{xy}^2)$$

And finally: $$\sqrt{Var[\epsilon]}=\sigma_\epsilon=\sigma_y\sqrt{(1-\rho_{xy}^2)}$$
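
The same algebra goes through with sample moments in place of expectations, so for least-squares residuals the identity should hold up to floating-point error rather than only approximately. Here is a minimal sketch of that check, assuming NumPy and some made-up example data (the coefficients, range and seed are arbitrary choices, not from the question):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-3.0, 3.0, 500)
y = 0.5 - 1.2 * x + rng.normal(0.0, 1.0, 500)  # made-up example data

# Slope and intercept from sample moments: beta = s_xy / s_x^2
s_xy = np.cov(x, y, ddof=0)[0, 1]
beta = s_xy / x.var()
alpha = y.mean() - beta * x.mean()
residuals = y - alpha - beta * x

# Sample correlation and the two sides of the identity
rho = s_xy / (x.std() * y.std())
lhs = residuals.std()                  # sigma_epsilon (residual std. dev.)
rhs = y.std() * np.sqrt(1 - rho**2)    # sigma_y * sqrt(1 - rho^2)
print(lhs, rhs)
```

All moments here use the population convention (`ddof=0`); using `ddof=1` consistently everywhere would work equally well, since the factor cancels.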

DatamineR