Fisher information for $\rho$ in a bivariate normal distribution

Question

I have seen many times people using the Delta method in order to find the asymptotic distribution of $r$, the sample correlation coefficient, for bivariate normal data. This distribution is given by

$$\sqrt{n} \left( r-\rho \right) \xrightarrow{D} \mathcal{N} \left(0, \left(1-\rho^2\right)^2 \right)$$

and this is a well known result (I know of the z-transform but it's not necessary in this context). I understand the method but what I have been wondering is, why they do not do something simpler. By the invariance of the mles of the sample means and variances, it is easy to show that the sample correlation coefficient is in fact the mle for $\rho$. Now as this is a mle, under the regularity conditions, it should follow the asymptotic distribution of the mle, namely

$$\sqrt{n} \left(r - \rho \right)\xrightarrow{D} \mathcal{N} \left(0, I^{-1} (\rho) \right)$$

where $I(\rho)$ is the Fisher information for $\rho$. All that remains now is to find $I(\rho)$. Differentiating twice the log of the bivariate normal distribution with respect to $\rho$ and taking the negative expectation, I believe one arrives at

$$I(\rho) = \frac{1+\rho^2}{\left(1-\rho^2\right)^2}$$

which, assuming I have not made a mistake in the lengthy computation, is very different than the above asymptotic variance, at least for not so small $\rho$. I have even run a few simulations that show the delta method to be far superior in most cases. The smaller asymptotic variance is in line with what one would expect from the mle, however it turns out to be a very bad approximation.

It's not impossible that I have made a mistake somewhere, although I have checked again and again. If that is not the case then, is there a conceptual mistake in the above reasoning? I have looked at some famous books on inference and nowhere do they mention the Fisher Information for $\rho$, which I also find quite puzzling.

I would appreciate any insight. Thank you.

Have a look at this answer of mine, http://stats.stackexchange.com/a/111963/28746 I cannot say if it is relevant to your situation, because it is not clear from your question how are the means and variances treated. — Alecos Papadopoulos, Nov 16 '15 at 03:29

score 6 · Accepted Answer · edited Apr 13 '17 at 12:44

6

The OP clarified in a comment that he examines the standard bivariate normal distribution, with means and variances fixed to zero and unity correspondingly,

$$f(x,y) = \frac{1}{2 \pi \sqrt{1-\rho^2}} \exp\left\{-\frac{x^2 +y^2 -2\rho xy}{2(1-\rho^2)}\right\} $$

In turn, this makes the distribution a member of the curved exponential family, and, as I have shown in my answer to this post the maximum likelihood estimator for $\rho$ in such a case does not equal the sample correlation coefficient. Specifically the sample correlation coefficient is

$$\tilde r = \frac 1n\sum_{i=1}^nx_iy_i$$

Denoting $\hat \rho$ the mle for $\rho$ and $(1/n)\sum_{i=1}^n(x_i^2 +y_i^2) = (1/n)S_2$, to be the sum of the sample variances of $X$ and $Y$, we obtain

$$\hat \rho: \hat \rho^3 -\tilde r \hat \rho^2 + \big[(1/n)S_2-1\big]\hat \rho -\tilde r=0$$

$$\Rightarrow \hat \rho\Big(\hat \rho^2 -\tilde r \hat \rho + \big[(1/n)S_2-1\big] \Big) = \tilde r$$

Doing the algebra, it is not difficult to conclude that we will obtain $\hat \rho = \tilde r$ if,and only if, $(1/n)S_2 =2$, i.e. only if it so happens that the sum of sample variances equals the sum of true variances. So in general for finite samples,

$$\hat \rho \neq \tilde r$$

Both remain consistent, but this alone does not imply that the asymptotic distribution of the sample correlation coefficient will attain the Cramer-Rao bound, which is the one found by the OP. And it doesn't.

edited Apr 13 '17 at 12:44

Community

1

answered Nov 16 '15 at 19:20

Alecos Papadopoulos

52,923
5
131
241

But isn't $r$ the mle for $\rho$? The invariance of the mle is still at play right? – JohnK Nov 16 '15 at 19:35
@JohnK No it is not, this is exactly the issue here. The Method-of-Moments estimator and the mle for $\rho$ do not coincide. – Alecos Papadopoulos Nov 16 '15 at 19:44
Thanks for clearing that up. It seems to me that there is no closed form solution for the mle of $\rho$ but if one solves for it in an iterative way, the inverse Fisher information in that case is the asymptotic variance, right? Again this emphasizes the efficiency of the mle, compared to the mom estimators. – JohnK Nov 16 '15 at 19:50
@JohnK. Indeed there is no reason to use the horrible exact solution formulas for a cubic equation. And the mle will attain the inverse Fisher information asymptotically. – Alecos Papadopoulos Nov 16 '15 at 20:31
3

I found a useful reference on the topic, in case you want to take a look. If you have Theory of Point Estimation by Lehmann and Casella, 2nd edition, in your shelves you might find Example 6.5, pg 472 illuminating. Interestingly, it seems that if you also want to estimate the variances of the bivariate normal distribution then it is my first equation that determines the asymptotic variance of $\widehat{r}$. – JohnK Nov 17 '15 at 18:17
@JohnK This is a great reference, that shows how the asymptotic variance might or might not change, depending on which parameters may be considered a priori known. – Alecos Papadopoulos Nov 22 '15 at 19:19

Fisher information for $\rho$ in a bivariate normal distribution

1 Answers1

Linked