In regression, why is $SE_y = SD_y\sqrt{1-r^2}$? Here $r$ is the correlation coefficient.

I can't really see it.

Oleg
  • "I can't really see it" suggests you seek intuition rather than proof. Which do you want? – Glen_b Aug 15 '16 at 09:50
  • @Glen_b I tried to prove it myself but my math didn't work out. I need intuition more :), I guess that will help me and my math. – Oleg Aug 15 '16 at 09:58
  • I posted an extended, detailed explanation of this at http://stats.stackexchange.com/a/71303/919. Your formula is explicitly highlighted as a bulleted item in the *Conclusions* section at the end. Many of the other bulleted items present alternative ways to express the same result. Because these conclusions are all based on purely geometric arguments, it might literally be possible to *see* them. – whuber Aug 15 '16 at 14:50

1 Answer

I am not sure about your notation, however:

$SSR = \sum (\hat y- \bar y)^2 $

$SSE = \sum (y- \hat y)^2 $

$SST = \sum (y- \bar y)^2 $

Using the regression identity: $SST = SSR + SSE$

It is out of scope but you can read more about the regression identity e.g. here.

It is straightforward to show from the identity that $SSE = SST - \frac{SSR}{SST} \cdot SST$, where $\frac{SSR}{SST} = r^2$ is the coefficient of determination. This ultimately gives $SSE = SST \cdot (1-r^2)$.
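As a quick numerical check of the two relations above, here is a minimal sketch in plain Python. The simulated data, the seed, and the variable names are assumptions made for illustration; the fit is an ordinary least-squares line with an intercept.

```python
import random

# Simulated data (an assumption for illustration): y is roughly linear in x.
random.seed(0)
n = 50
x = [random.gauss(0, 1) for _ in range(n)]
y = [2.0 + 1.5 * xi + random.gauss(0, 1) for xi in x]

# Least-squares slope and intercept for a simple linear regression.
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
slope = sxy / sxx
intercept = y_bar - slope * x_bar
y_hat = [intercept + slope * xi for xi in x]

# The three sums of squares as defined in the answer.
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
sst = sum((yi - y_bar) ** 2 for yi in y)

# Sample correlation coefficient between x and y.
r = sxy / (sxx * sst) ** 0.5

print("SST - (SSR + SSE)   =", sst - (ssr + sse))
print("SSE - SST*(1 - r^2) =", sse - sst * (1 - r ** 2))
```

Both printed differences should be zero up to floating-point rounding.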

If we take $SE_y = \sqrt{\frac{SSE}{n-k}}$, where $n$ is the number of observations and $k$ is the number of variables in the model, then dividing both sides of the identity above by $n-k$ and taking the square root gives $SE_y = \sqrt{\frac{SST}{n-k}}\sqrt{1-r^2}$.

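$SD_y = \sqrt{\frac{SST}{n-1}}$ is just the sample standard deviation of the $y$'s, so $SE_y = SD_y\sqrt{\frac{n-1}{n-k}}\sqrt{1-r^2}$. This means that the relationship suggested by the OP is not exactly correct: it holds only up to the factor $\sqrt{\frac{n-1}{n-k}}$, which approaches 1 as $n$ grows.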
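To see how large the discrepancy is in a small sample, the sketch below compares $SE_y$ with $SD_y\sqrt{1-r^2}$ on simulated data. The helper name `se_and_sd`, the data, and the choice $k = 2$ (intercept and slope both estimated) are assumptions for illustration, not part of the original answer.

```python
import math
import random
import statistics


def se_and_sd(x, y, k=2):
    """Return (SE_y, SD_y, r) for a least-squares line fitted to (x, y).

    k is the number subtracted from n in the SE_y denominator; k=2
    (intercept and slope) is an assumption made for this illustration.
    """
    n = len(y)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_y = math.sqrt(sse / (n - k))   # residual standard error
    sd_y = statistics.stdev(y)        # sample SD, denominator n - 1
    r = sxy / math.sqrt(sxx * sst)    # sample correlation
    return se_y, sd_y, r


random.seed(1)
n = 20
x = [random.uniform(0, 10) for _ in range(n)]
y = [1.0 + 0.8 * xi + random.gauss(0, 2) for xi in x]

se_y, sd_y, r = se_and_sd(x, y)
naive = sd_y * math.sqrt(1 - r ** 2)              # the OP's formula
corrected = naive * math.sqrt((n - 1) / (n - 2))  # with the df factor

print(f"SE_y                        = {se_y:.6f}")
print(f"SD_y * sqrt(1 - r^2)        = {naive:.6f}")
print(f"... * sqrt((n - 1)/(n - 2)) = {corrected:.6f}")
```

With $n = 20$ the correction factor is $\sqrt{19/18} \approx 1.027$, so the OP's formula understates $SE_y$ by a few percent here; the gap shrinks as $n$ increases.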

IcannotFixThis