I had thought that correlation is invariant to scaling and offsets. Since $\hat{y} = x\hat{\beta}_1 + \hat{\beta}_0$, $\hat{y}$ is just a scaled and offset version of $x$.

So shouldn't $\mathrm{corr}(y, \hat{y}) = \mathrm{corr}(y, x)$? Where does the absolute value come from?

student010101
  • There's a derivation of $\text{Cor}(aX+b, Y) = \text{sgn}(a) \, \text{Cor}(X,Y)$ in the question [The equivalence of sample correlation and R statistic for simple linear regression](https://stats.stackexchange.com/q/99669/22228) – Silverfish Apr 12 '21 at 00:04
  • @Silverfish I was literally looking at your answer in that post earlier today for a different reason (I mainly paid attention to the vector space part) – student010101 Apr 12 '21 at 01:01
  • @Silverfish This is one of those cases where I blindly did $\frac{a}{\sqrt{a^2}} = 1$ instead of $sign(a)$. – student010101 Apr 12 '21 at 01:02

1 Answer

Correlation is invariant to offsets and positive scaling. Negative scaling flips the sign.
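
As a short worked step (a sketch in generic notation, with $a \neq 0$ and $b$ standing in for any slope and offset), the sign comes from the standard deviation picking up $|a|$ while the covariance picks up $a$:

$$\mathrm{corr}[ax+b,\,y]=\frac{\mathrm{cov}[ax+b,\,y]}{\mathrm{sd}[ax+b]\,\mathrm{sd}[y]}=\frac{a\,\mathrm{cov}[x,y]}{|a|\,\mathrm{sd}[x]\,\mathrm{sd}[y]}=\mathrm{sgn}(a)\,\mathrm{corr}[x,y]$$

In other words, $a/\sqrt{a^2}=a/|a|=\mathrm{sgn}(a)$, which equals $1$ only when $a>0$.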

If the correlation between $y$ and $x$ is negative, then $\hat\beta_1<0$. The offset $\hat\beta_0$ does not affect the correlation, so $$\mathrm{corr}[\hat y,y]=\mathrm{corr}[x\hat\beta_1,y]=-\mathrm{corr}[x,y]=\left|\mathrm{corr}[x,y]\right|$$
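
A quick numerical check may help here. The sketch below is illustrative only: the simulated data, the seed, and the variable names are assumptions, not part of the question. It fits a simple regression with an intercept to negatively related data and compares the two correlations.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only

# Simulated data with a negative relationship (assumed values, not from the question).
x = rng.normal(size=200)
y = -2.0 * x + rng.normal(size=200)

# Ordinary least squares with an intercept; polyfit returns [slope, intercept].
slope, intercept = np.polyfit(x, y, deg=1)
y_hat = slope * x + intercept

r_xy = np.corrcoef(x, y)[0, 1]       # correlation between y and x
r_fit = np.corrcoef(y, y_hat)[0, 1]  # correlation between y and the fitted values

print(f"slope          = {slope:.4f}")
print(f"corr(y, x)     = {r_xy:.4f}")
print(f"corr(y, y_hat) = {r_fit:.4f}")
```

With these data the fitted slope is negative, so `r_fit` should match `abs(r_xy)` up to floating-point error rather than `r_xy` itself.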

Thomas Lumley
  • Is it possible for the correlation between $\hat{y}$ and $y$ to be negative in cases where you don't include the intercept? I know that $R^2$ can become negative in this situation, but I'm not sure about $R$. – student010101 Apr 11 '21 at 22:22
  • Yes, quite possible. You hear more about it for $R^2$ because it's *surprising* that $R^2$ can be negative – Thomas Lumley Apr 11 '21 at 23:30
  • @student010101 Of course some caution is merited, since in regressions without intercept, the [coefficient of determination "R-squared" isn't the square of R](https://stats.stackexchange.com/q/99669/22228)! For your new Q, try the following R code: `x – Silverfish Apr 11 '21 at 23:56
  • @student010101 The correlation between $y$ and $\hat y$ (fitted without an intercept term) will be negative if $x$ and $y$ have a negative correlation but the regression slope is still positive, so the sign-flipping doesn't occur. The graph in my example makes it clear why this can happen. Since this is a slightly different question to the one you originally asked, you should consider asking it as a new question (though it's worth checking for duplicates first, in case it's been asked already) – Silverfish Apr 12 '21 at 00:09
  • @Silverfish Just to clarify, when you do involve an intercept, R-squared is ALWAYS the square of R right? – student010101 Apr 12 '21 at 01:04
  • @student010101 Yes, and [the link between $R$ and $R^2$ can be visualised geometrically](https://stats.stackexchange.com/a/130114/22228) provided there's an intercept included. Moreover, including an intercept ensures [there's no difference between $R^2$ and $r^2$](https://stats.stackexchange.com/q/134167/22228). Also worth pointing out that [if you *don't* include an intercept, your statistical software might do something different to the standard formula when calculating $R^2$](https://stats.stackexchange.com/q/26176/22228). – Silverfish Apr 12 '21 at 03:50
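
The no-intercept case discussed in the comments can be sketched with a small hypothetical data set (this is not the truncated R example above; the numbers and names are made up for illustration). Both variables sit far from the origin, so the through-origin slope is positive even though $x$ and $y$ are negatively correlated. The $R^2$ here uses the centered formula $1 - SS_{res}/SS_{tot}$, which is an assumption: as noted above, software may compute it differently when the intercept is dropped.

```python
import numpy as np

# Hypothetical data: negatively correlated, but both far from the origin.
x = np.array([11.0, 12.0, 13.0, 14.0, 15.0])
y = np.array([9.0, 8.0, 7.0, 6.0, 5.0])

# Least-squares slope for a regression through the origin (no intercept).
slope = np.sum(x * y) / np.sum(x**2)
y_hat = slope * x

r_xy = np.corrcoef(x, y)[0, 1]
r_fit = np.corrcoef(y, y_hat)[0, 1]

# R^2 via the centered formula (assumed definition; software may differ).
ss_res = np.sum((y - y_hat) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot

print(f"slope          = {slope:.4f}")
print(f"corr(y, x)     = {r_xy:.4f}")
print(f"corr(y, y_hat) = {r_fit:.4f}")
print(f"R^2 (centered) = {r2:.4f}")
```

Because the slope is positive, no sign flip happens and `r_fit` keeps the negative sign of `r_xy`; the centered $R^2$ also comes out negative, so it cannot be the square of that correlation.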