I have read that for the null hypothesis $H_0 : \beta = \beta_0$, the likelihood ratio statistic based on the profile likelihood is $$LR = 2 (\log L_p(\hat\beta_p) - \log L_p(\beta_0))$$ where $\hat\beta_p$ maximises the profile likelihood $L_p(\beta) = \max_{\gamma} L(\beta, \gamma)$.

And that by assuming $LR$ has some distribution we can create a confidence interval for $\beta$. I've seen a lot of questions asking how to do this - I understand that - but I am confused as to why we can do this.

I.e., why (or when) can I assume $LR$ has, say, a normal distribution, given that I know the distributions of $\beta$ and $\gamma$? I feel I am missing something obvious, as I can't find any good resources on this.

Jarle Tufto
JDoe2

2 Answers


From the definition of the profile likelihood function it follows that $$ 2(\ln L_p(\hat\beta) - \ln L_p(\beta)) = 2(\ln L(\hat\beta,\hat\gamma)-\ln L(\beta,\hat\gamma_0)) \tag{1} $$ where $\beta$ denotes the unknown fixed value of the parameter of interest, $(\hat\beta,\hat\gamma)$ denotes the joint MLEs, and $\hat\gamma_0$ the MLE of $\gamma$ if we fix $\beta$.

By Wilks' theorem, the right hand side of (1) has an approximate (asymptotic) chi-square distribution with 1 degree of freedom. We can thus use the left hand side as a pivotal quantity, which leads to $$ P(2(\ln L_p(\hat\beta) - \ln L_p(\beta))\le \chi_{1,\alpha}^2)\approx 1-\alpha, $$ where $\chi^2_{1,\alpha}$ denotes the upper $\alpha$ quantile of the $\chi^2_1$ distribution. Rewriting the event inside the parentheses we have $$ P(\ln L_p(\beta)\ge \ln L_p(\hat\beta) - \frac{\chi_{1,\alpha}^2}2)\approx 1-\alpha, $$ or $$ P(L \le \beta \le U)\approx 1-\alpha, $$ where the random variables $L$ and $U$ denote the values of $\beta$ for which the profile log likelihood is $\chi^2_{1,\alpha}/2$ smaller than the maximum profile log likelihood.

$(L,U)$ is therefore an approximate $(1-\alpha)$-confidence interval for $\beta$.
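To make this concrete, here is a minimal numeric sketch (my own illustration, not part of the answer) for the model $X_i \sim N(\beta, \gamma)$ with the variance $\gamma$ as nuisance parameter. The variance can be profiled out in closed form, and the interval endpoints $(L,U)$ are found by bisection at the points where the profile log likelihood drops $\chi^2_{1,0.95}/2 \approx 1.921$ below its maximum:

```python
import math

# Toy data; assumed model: X_i ~ Normal(beta, gamma), gamma = variance (nuisance)
data = [4.2, 5.1, 3.8, 4.9, 5.5, 4.4, 4.0, 5.2, 4.7, 4.6]
n = len(data)
beta_hat = sum(data) / n                     # joint MLE of beta

def profile_loglik(beta):
    # For fixed beta, the MLE of gamma is the mean squared deviation about beta,
    # so L_p(beta) = max_gamma L(beta, gamma) has a closed form here.
    gamma_hat = sum((x - beta) ** 2 for x in data) / n
    return -0.5 * n * (math.log(2 * math.pi * gamma_hat) + 1)

Q95 = 3.841  # chi-square quantile with 1 df at level 0.95

def lr(beta):
    # 2*(log L_p(beta_hat) - log L_p(beta)), the statistic in (1)
    return 2.0 * (profile_loglik(beta_hat) - profile_loglik(beta))

def endpoint(outside, inside):
    # Bisection, assuming lr(outside) > Q95 > lr(inside);
    # returns the beta at which lr(beta) = Q95.
    for _ in range(200):
        mid = 0.5 * (outside + inside)
        if lr(mid) > Q95:
            outside = mid
        else:
            inside = mid
    return 0.5 * (outside + inside)

s = math.sqrt(sum((x - beta_hat) ** 2 for x in data) / n)
lower = endpoint(beta_hat - 5 * s, beta_hat)   # L
upper = endpoint(beta_hat + 5 * s, beta_hat)   # U
print(f"95% profile-likelihood CI for beta: ({lower:.3f}, {upper:.3f})")
```

The same inversion works for models without a closed-form profile; the inner maximisation over $\gamma$ is then done numerically inside `profile_loglik`.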

Jarle Tufto
  • Oh! I see, yes sorry, I misread a $Z^2$ as $Z$ in the resource I am using hence my confusion! Thank you for clearing this up for me! – JDoe2 Jan 29 '19 at 21:35

Well, this all has to do with the asymptotic distribution of likelihood ratios. It is a general result that, under the null hypothesis, $-2\log\Lambda\sim\chi^2(k)$ asymptotically, where $\Lambda$ is the likelihood ratio; with your definition, this is exactly the statistic $LR$. Here, $k$ is the number of parameters restricted by the null hypothesis, i.e. the number of coefficients in $\beta$ that $H_0$ fixes. This result is known as Wilks' theorem; for more information see:

https://en.wikipedia.org/wiki/Likelihood-ratio_test

To conclude: you can (asymptotically) construct a confidence interval for $\beta$ by inverting the likelihood ratio test, using the asymptotic $\chi^2$ distribution of $LR$. In particular, you should not assume that $LR$ has a normal distribution; asymptotically it is $\chi^2$, not normal.
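As a quick sanity check of Wilks' theorem (my own simulation sketch, using the normal-mean-with-unknown-variance model from the question's setup): simulate many datasets under $H_0$, compute the $LR$ statistic with the nuisance variance profiled out, and compare the empirical 95th percentile with $\chi^2_{1,0.95}\approx 3.841$.

```python
import math
import random

random.seed(1)

def lr_stat(data, beta0):
    # LR = 2*(log L_p(beta_hat) - log L_p(beta0)) for X_i ~ Normal(beta, gamma);
    # profiling out the variance gives LR = n * log(ghat(beta0) / ghat(beta_hat))
    n = len(data)
    beta_hat = sum(data) / n
    def ghat(b):  # profiled variance estimate at fixed mean b
        return sum((x - b) ** 2 for x in data) / n
    return n * math.log(ghat(beta0) / ghat(beta_hat))

# Simulate under H0: true beta = 0, gamma = 1
stats = sorted(
    lr_stat([random.gauss(0.0, 1.0) for _ in range(50)], beta0=0.0)
    for _ in range(2000)
)
q95 = stats[int(0.95 * len(stats))]
print(f"empirical 95th percentile of LR: {q95:.3f}  (chi^2_1 gives 3.841)")
```

With $k$ parameters fixed under $H_0$ the comparison would instead be against the $\chi^2_k$ quantile.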

J. Dekker