
Obviously the easiest way to get the AIC of a linear model in R is to fit it with the lm function and then call AIC on the result. However, this simple approach can be quite time consuming, because lm computes a lot of things that aren't strictly needed to calculate the AIC. When this is done for thousands of different models, that time quickly adds up.

As I am just interested in the AIC and don't care about anything else in the model, is there any easy way of getting just that without having to calculate the full lm object?

I was thinking about using lm.fit (which takes only about a tenth of the time of lm), or even better solve(crossprod(X), t(X) %*% y), to get the coefficients, but then I am not sure how to obtain the log-likelihood in a general way. Do you have any ideas, or maybe a different approach to getting the AIC?
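For reference, a minimal sketch of the coefficient computations I have in mind (assuming `X` is the design matrix, including an intercept column, and `y` is the response):

```r
# Bare-bones least-squares fits, skipping most of what lm() computes
fit  <- lm.fit(X, y)                     # QR-based, roughly 1/10 the time of lm()
beta <- fit$coefficients

# Or via the normal equations (assumes X has full column rank):
beta <- solve(crossprod(X), t(X) %*% y)
```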

Many thanks in advance

Gisalbur
  • The quantity of "Information" needs to be estimated from the model; you cannot avoid calculating it. – have fun Sep 16 '17 at 15:57
  • Well, I understand that I need something from the model, but do I really need all of it? Using `lm` gives me so much output I don't need, while the formula for the information only requires the number of parameters (which I have without fitting the model), the MLEs (which are easy to get by the methods described above), and the log-likelihood, which is just one item in the long list of things that fitting the full model gives me. I am looking for a way to avoid computing what isn't strictly necessary. Do you really think that this is not possible? Thanks anyway for the answer – Gisalbur Sep 16 '17 at 17:18

1 Answer


The AIC is $-2\log \hat{\mathcal{L}}+2p$, where $\hat{\mathcal{L}}$ is the likelihood $\mathcal{L}$ evaluated at its maximum.

In a regression model, if you replace the parameters by their MLEs,

\begin{eqnarray} \hat{\mathcal{L}}&=&(2\pi\hat{\sigma}^2)^{-n/2} \exp\left(-\frac{1}{2\hat{\sigma}^2}\sum_i (y_i-\hat{y}_i)^2\right)\\ &=&(2\pi\hat{\sigma}^2)^{-n/2} \exp\left(-\frac{1}{2\hat{\sigma}^2}\, n\hat{\sigma}^2\right)\\ &=&(2\pi\hat{\sigma}^2)^{-n/2} \exp\left(-\frac{n}{2}\right) \end{eqnarray}

So

\begin{eqnarray} -2\log(\hat{\mathcal{L}})&=& n \log(2\pi e)+n\log(\hat{\sigma}^2) \end{eqnarray}

The $p$ in the "$+2p$" term of the AIC counts every parameter you maximized over in calculating the likelihood, i.e. the entries of the vector $\beta$ (including the intercept) plus $\sigma^2$.

[It's common to drop constants like $-\frac{n}{2} \log(2\pi e)$ in calculating the log-likelihood with continuous distributions (perhaps even more so when proceeding to the AIC). This makes no difference to a comparison of AICs, unless you're comparing with likelihoods that don't drop the same constant.]
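As a quick illustration of that invariance, subtracting the shared constant $n\log(2\pi e)$ from each model's AIC leaves the difference untouched (a toy sketch with simulated data):

```r
set.seed(2)
x1 <- rnorm(50); x2 <- rnorm(50)
y  <- 1 + 2 * x1 + rnorm(50)
m1 <- lm(y ~ x1)
m2 <- lm(y ~ x1 + x2)
const <- 50 * log(2 * pi * exp(1))       # the constant both models share
AIC(m2) - AIC(m1)                        # difference with the constant kept
(AIC(m2) - const) - (AIC(m1) - const)    # identical with it dropped
```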

Note then (whether you drop constants or not) that to calculate the AIC you need $\hat{\sigma}^2$, which is $\frac{1}{n}\text{SSE}$ (where SSE is the sum of squared residuals; note the $n$ divisor rather than $n-p$). There are various convenient ways to calculate that, depending on which computations you ultimately carry out.
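Putting this together, here is a minimal sketch in R (the helper name `fast_aic` is mine, and it assumes `X` already contains the intercept column):

```r
# AIC for a Gaussian linear model with minimal overhead
fast_aic <- function(X, y) {
  n          <- length(y)
  res        <- lm.fit(X, y)$residuals   # QR fit, much cheaper than lm()
  sigma2_hat <- sum(res^2) / n           # MLE of sigma^2: SSE/n, not SSE/(n - p)
  loglik     <- -n / 2 * (log(2 * pi) + log(sigma2_hat) + 1)
  p          <- ncol(X) + 1              # the betas (incl. intercept) plus sigma^2
  -2 * loglik + 2 * p
}

# Agrees with the built-in on a toy example:
set.seed(1)
X <- cbind(1, rnorm(100))
y <- drop(X %*% c(2, 3)) + rnorm(100)
c(fast_aic(X, y), AIC(lm(y ~ X - 1)))
```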

Glen_b
  • Should you not remove the -1/2-factors on the rhs in the last display? See also your answer at https://stats.stackexchange.com/questions/87345/calculating-aic-by-hand-in-r – Christoph Hanck Mar 08 '19 at 09:11