0

I have a statistics final coming up and I'm doing some practice questions. I'm unable to do this question and I was hoping somebody would be able to help me understand how to do it.

Question: A least squares regression of Y on X gives a slope of 2.7. Using the same data for a least-squares regression of X on Y gives a slope of 0.3. What is the value of the correlation coefficient?

Any help is appreciated,

Thanks

  • 2
    Please add the self-study tag and read its wiki. I'll give a hint. Let $y = \alpha + \beta x$ be your regression equation. Then $\beta = r_{XY}\dfrac{\sigma_Y}{\sigma_X}$. – Dave Apr 20 '20 at 19:15
  • 1
    See https://stats.stackexchange.com/questions/22718/what-is-the-difference-between-linear-regression-on-y-with-x-and-x-with-y and apply algebra. – whuber Apr 20 '20 at 19:15
  • 1
    The answer you accepted is not correct in saying that you need the means to find the correlation. From the information you gave, I was able to calculate a correlation of $0.9$. If you have an answer key, the listed solution is, I'm guessing, $0.9$. – Dave Apr 20 '20 at 20:00
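
A sketch of where the hint in the comments leads, assuming both regressions include an intercept (the question does not say this explicitly). The labels $\beta_{Y \mid X}$ and $\beta_{X \mid Y}$ are introduced here only to tell the two slopes apart:

$$ \begin{align} \beta_{Y \mid X} &= r_{XY}\dfrac{\sigma_Y}{\sigma_X} = 2.7 \\ \beta_{X \mid Y} &= r_{XY}\dfrac{\sigma_X}{\sigma_Y} = 0.3 \\ \beta_{Y \mid X}\,\beta_{X \mid Y} &= r_{XY}^2 = 0.81 \\ r_{XY} &= +\sqrt{0.81} = 0.9 \end{align} $$

The standard deviations cancel in the product, so neither they nor the means are needed. The positive root is taken because each slope has the same sign as $r_{XY}$ and both slopes are positive.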

1 Answer

-3

Denote the two slopes by $\beta_{yx}$ and $\beta_{xy}$, and the sample correlation coefficient by $\rho_{xy}$. Taking both regressions to be through the origin (no intercept term), we have: $$ \begin{align} \beta_{yx} & = (x^Tx)^{-1}x^Ty \\ \beta_{xy} &= (y^Ty)^{-1}y^Tx \\ \rho_{xy} & = \left((x-m_x)^T(x-m_x)(y-m_y)^T(y-m_y)\right)^{-1/2}(x-m_x)^T(y-m_y) \end{align} $$ where $m_x$ and $m_y$ are the means of $x$ and $y$ respectively. If $m_x$ and $m_y$ are unknown, you can see from these equations that it's impossible to get $\rho_{xy}$ from $\beta_{xy}$ and $\beta_{yx}$.

But if $x$ and $y$ are both centred, i.e. $m_x=m_y=0$, then from the equations above you can easily get: $$ \rho_{xy} = \sqrt{\beta_{xy}\beta_{yx}} $$ with the sign of $\rho_{xy}$ matching the common sign of the two slopes.

Haotian Chen
  • 1
    -1 This appears to contradict my comment. I used the equation in my comment, without knowing the means of either variable, to calculate a correlation of $0.9$, which is consistent with simulations I've done. – Dave Apr 20 '20 at 19:58
  • Hi @Dave, yes, your comment is correct in its own right, but it's not correct for this question. The question says "regression of Y on X", which means $y = \beta_{yx}x + \epsilon,\epsilon \sim N(0,\sigma^2)$, not $y=a+\beta_{yx}x+\epsilon, \epsilon\sim N(0,\sigma^2)$ as in your comment. Also, you are using quantities such as $\sigma_x$ and $\sigma_y$, which are not provided in the question, and you neglect to use $\beta_{xy}$, which is provided in the question. – Haotian Chen Apr 20 '20 at 20:10
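
The two readings argued over in this thread can be compared numerically. The following is a minimal sketch, not anyone's posted code: the data are simulated with arbitrary non-zero means, and the helper names `slope_with_intercept` and `slope_through_origin` are hypothetical. It checks whether $\sqrt{\beta_{xy}\beta_{yx}}$ recovers the sample correlation under each convention.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data with deliberately non-zero means (values are arbitrary).
n = 10_000
x = 5.0 + rng.normal(size=n)
y = -3.0 + 2.0 * x + rng.normal(size=n)


def slope_with_intercept(u, v):
    """OLS slope of v on u when an intercept is fitted: cov(u, v) / var(u)."""
    return np.cov(u, v, ddof=1)[0, 1] / np.var(u, ddof=1)


def slope_through_origin(u, v):
    """OLS slope of v on u with no intercept: (u'u)^{-1} u'v."""
    return (u @ v) / (u @ u)


# Sample Pearson correlation, computed directly from the data.
r = np.corrcoef(x, y)[0, 1]

# Convention 1: both regressions include an intercept.
b_yx = slope_with_intercept(x, y)
b_xy = slope_with_intercept(y, x)
r_from_slopes = np.sign(b_yx) * np.sqrt(b_yx * b_xy)

# Convention 2: both regressions are forced through the origin.
o_yx = slope_through_origin(x, y)
o_xy = slope_through_origin(y, x)
r_from_origin_slopes = np.sign(o_yx) * np.sqrt(o_yx * o_xy)

print(f"sample correlation:            {r:.4f}")
print(f"from slopes (with intercept):  {r_from_slopes:.4f}")
print(f"from slopes (through origin):  {r_from_origin_slopes:.4f}")

# The slopes given in the question, combined the same way.
print(f"sqrt(2.7 * 0.3) = {np.sqrt(2.7 * 0.3):.4f}")
```

With an intercept fitted, the product of the two slopes equals the squared correlation whatever the means are. With both fits forced through the origin, the same formula matches the correlation only when the data are centred, which is the case the answer singles out.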