
Several sources describe an algorithm for computing the variance of an MLE in R. In short:

  1. Construct the negative log-likelihood function.

  2. Minimize it via nlm or optim with hessian=TRUE.

  3. Invert the Hessian and read off the diagonal entries.
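The three steps might be sketched as follows for a Bernoulli sample like the one described below (the variable names and the true parameter value 0.3 are my own illustrative choices):

```r
# Simulate a Bernoulli sample; the true success probability is assumed to be 0.3
set.seed(1)
n <- 1000
x <- rbinom(n, size = 1, prob = 0.3)

# 1. Negative log-likelihood of the whole sample
negloglik <- function(p) -sum(dbinom(x, size = 1, prob = p, log = TRUE))

# 2. Minimize it, requesting the Hessian at the optimum
fit <- optim(par = 0.5, fn = negloglik, method = "Brent",
             lower = 1e-6, upper = 1 - 1e-6, hessian = TRUE)

# 3. Invert the Hessian; the diagonal estimates the variance of the MLE
var_hessian <- diag(solve(fit$hessian))

# Closed-form comparison: MLE is the sample mean, variance is p(1-p)/n
p_hat      <- mean(x)
var_closed <- p_hat * (1 - p_hat) / n
```

The two variance estimates should agree closely, which matches the numerical experiment described in the question.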

I simulated 1000 Bernoulli random variables and applied this algorithm. Then I computed the closed-form MLE and its variance; the values were almost identical, so the approach works. But a problem arises when I look at it from the theoretical point of view. By asymptotic normality we have \begin{align*} \sqrt{n}(\hat{\theta}_n - \theta_0) \overset{d}{\rightarrow} \mathcal{N}(0,\mathcal{I}^{-1}(\theta_0)) \end{align*} and therefore, for $n$ large enough, \begin{align*} \hat{\theta}_n \sim \mathcal{N}\left(\theta_0,\frac{\mathcal{I}^{-1}(\theta_0)}{n}\right) \end{align*} So I ask myself: why is the inverted Hessian not divided by $n$ (the number of observations)? If I understand correctly, $\mathcal{H}^{-1}(\hat{\theta})$ is only an approximation of $\mathcal{I}^{-1}(\theta_0)$.

holic

1 Answer


The Fisher information you present in the first formula is for a single observation.

The Hessian is for a sum of $n$ terms.

Consequently the Hessian (the corresponding information for the entire sample) incorporates $n$ terms; when you invert it, the result effectively contains a factor of $\frac{1}{n}$ multiplied by the inverse of the "average" of those $n$ terms.
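As a concrete check, using the Bernoulli setup from the question (where the single-observation Fisher information is $\mathcal{I}(p) = \frac{1}{p(1-p)}$), the sample log-likelihood is a sum of $n$ terms, so its Hessian already scales with $n$:

\begin{align*}
\ell(p) &= \sum_{i=1}^{n} \left[ x_i \log p + (1 - x_i)\log(1 - p) \right] \\
\mathcal{H}(\hat{p}) &= -\ell''(\hat{p}) = \frac{n}{\hat{p}(1-\hat{p})} = n\,\mathcal{I}(\hat{p}) \\
\mathcal{H}^{-1}(\hat{p}) &= \frac{\hat{p}(1-\hat{p})}{n} = \frac{\mathcal{I}^{-1}(\hat{p})}{n}
\end{align*}

Inverting the sample Hessian therefore delivers $\frac{\mathcal{I}^{-1}(\hat{p})}{n}$ directly; no further division by $n$ is needed.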

Glen_b