Is the following estimator $\hat{\rho}$ unbiased for $\rho$?

$\hat{\rho} = \frac{\frac{1}{n}\sum_{i=1}^n(Y_i - \bar{Y})(X_i - \bar{X})}{\sqrt{\frac{1}{n}\sum_{i=1}^n(Y_i - \bar{Y})^2\,\frac{1}{n}(X_i - \bar{X})^2}}$

Here $\rho$ is Pearson's correlation coefficient.

First, I simplified the expression by cancelling the factors of $n$. This gives:

$\frac{(Y_i - \bar{Y})(X_i - \bar{X})}{\sqrt{(Y_i - \bar{Y})^2(X_i - \bar{X})^2}}$

This fraction appears to reduce to 1, so I concluded that the estimator is unbiased. Is this calculation correct, or did I make a mistake in applying the linearity of expectation?

AdamO
  • Are you missing another summand expression in the denominator for the SSXX term? You definitely do not want the expression to go to 1. That's biased, except if the correlation actually *is* one. – AdamO Mar 19 '18 at 15:08
  • Assuming that there is a summand missing in the denominator, here you can find an answer: https://stats.stackexchange.com/questions/220961/is-the-sample-correlation-coefficient-an-unbiased-estimator-of-the-population-co – Ale Mar 19 '18 at 16:45
  • @Alessandro I see the answer, but I'm having trouble deriving it. I did miss a summation term in the fraction. Is my work wrong somewhere in calculating the expected value? – Johnny G. Mar 19 '18 at 17:17
  • If you put the summation term back in, it will become clear (I hope). $\sqrt{\sum x_i^2y_i^2} \neq \sum x_i y_i$. You can't interchange the order of summing and taking the square root. – jbowman Mar 19 '18 at 17:21
  • @jbowman Right, but then how do I get from there to the answer Alessandro linked above? Confused about the mathematical proof to get there. – Johnny G. Mar 19 '18 at 17:53
  • A hint: If you knew the true means and used them instead of the sample means in the numerator, your estimator would be unbiased. Try writing that out, then using the trick of expanding the expression to include terms like: $(Y_i - \bar{Y} + \bar{Y} -\mu_Y)$ and work through the numerator some more. The fundamental issue is that $\bar{Y}$ and $\bar{X}$ are correlated if $Y$ and $X$ are, which causes a slight bias of $O(n^{-2})$, where one of the $n$s comes from dividing the sum by $n$ and the other from the fact that $\bar{Y}$... are averages, i.e., divided by $n$ themselves. – jbowman Mar 19 '18 at 18:04
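
The points raised in the comments can be checked numerically. The following is only a rough sketch, not a proof: it assumes bivariate normal data with unit variances, and the helper names `sample_correlation` and `mean_of_estimator` are made up for illustration. It computes $\hat{\rho}$ with both summations kept inside the square root, then averages it over many simulated samples to approximate $E[\hat{\rho}]$.

```python
import math
import random


def sample_correlation(xs, ys):
    """Sample Pearson correlation with both sums kept inside the square root."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    # The 1/n factors cancel between numerator and denominator,
    # but the sums themselves do not cancel.
    return s_xy / math.sqrt(s_xx * s_yy)


def mean_of_estimator(rho, n, n_reps, seed=0):
    """Monte Carlo average of the sample correlation over n_reps samples of size n.

    Assumption (not from the thread): (X, Y) is bivariate normal with
    zero means, unit variances, and correlation rho.
    """
    rng = random.Random(seed)
    scale = math.sqrt(1.0 - rho ** 2)
    total = 0.0
    for _ in range(n_reps):
        xs, ys = [], []
        for _ in range(n):
            x = rng.gauss(0.0, 1.0)
            # Y = rho * X + sqrt(1 - rho^2) * Z has correlation rho with X.
            y = rho * x + scale * rng.gauss(0.0, 1.0)
            xs.append(x)
            ys.append(y)
        total += sample_correlation(xs, ys)
    return total / n_reps


if __name__ == "__main__":
    # A single small sample already shows the ratio is not identically 1.
    print("rho-hat for one sample:", sample_correlation([1, 2, 3], [1, 3, 2]))

    rho, n = 0.5, 5
    print("true rho:", rho)
    print("Monte Carlo mean of rho-hat:", mean_of_estimator(rho, n, 20000))
```

For the sample `[1, 2, 3]`, `[1, 3, 2]` the estimator equals 0.5, so the fraction does not collapse to 1 once the sums are restored. Under the stated assumptions, the simulated mean for $n = 5$ and $\rho = 0.5$ should come out below 0.5, which is consistent with the estimator being biased in small samples.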

0 Answers