I have a model with an AIC of 78809. Does this mean it is a very bad model, or should the value be interpreted differently? The data have 15 variables, a two-level response variable and 40000 rows.

The step() function from the R statistical package returns almost the same AIC: 78600. Is there any point in applying step() in that case?

mkt
Marcin Kosiński
  • You compare AIC metrics of different models. The absolute value has little meaning. – Aksakal Dec 18 '14 at 15:29
  • It is questionable if `step` does any good: https://stats.stackexchange.com/questions/20836/algorithms-for-automatic-model-selection – Tim Aug 08 '18 at 07:01

2 Answers

This is from the description of AIC:

The Akaike information criterion (AIC) is a measure of the relative quality of a statistical model for a given set of data. As such, AIC provides a means for model selection.

I don't pay attention to the absolute value of AIC. I only use it to compare the in-sample fit of candidate models. Note that if you're building forecasting models, it is also important to consider out-of-sample fit.
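
A minimal sketch of why only differences matter. The data, the two toy models and the constant `c` below are made up for illustration and are not taken from the question. AIC is 2k − 2·log-likelihood, so adding the same constant to every model's log-likelihood (as can happen when software keeps or drops constant terms) changes each AIC but leaves the gap between models untouched.

```python
import math


def bernoulli_loglik(ys, ps):
    """Log-likelihood of binary outcomes ys under predicted probabilities ps."""
    return sum(math.log(p) if y == 1 else math.log(1.0 - p)
               for y, p in zip(ys, ps))


def aic(loglik, k):
    """AIC = 2k - 2 * log-likelihood, with k the number of fitted parameters."""
    return 2 * k - 2 * loglik


# Made-up data: a binary predictor x and a binary response y.
xs = [0] * 10 + [1] * 10
ys = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0] + [1, 1, 1, 0, 1, 1, 1, 1, 0, 1]

# Model A: one overall probability (k = 1), estimated by the sample mean.
p_all = sum(ys) / len(ys)
loglik_a = bernoulli_loglik(ys, [p_all] * len(ys))

# Model B: one probability per level of x (k = 2), estimated by group means.
p_by_x = {
    level: sum(y for x, y in zip(xs, ys) if x == level) / xs.count(level)
    for level in (0, 1)
}
loglik_b = bernoulli_loglik(ys, [p_by_x[x] for x in xs])

aic_a, aic_b = aic(loglik_a, 1), aic(loglik_b, 2)
delta = aic_a - aic_b

# Hypothetical constant added to every log-likelihood.
c = 1000.0
delta_shifted = aic(loglik_a + c, 1) - aic(loglik_b + c, 2)

print("AIC A:", aic_a, "AIC B:", aic_b)
print("difference:", delta, "difference after shift:", delta_shifted)
```

The individual AIC values move by 2·c, but the difference used to rank the models stays the same, which is why a raw figure such as 78809 says nothing on its own.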

Aksakal
  • +1. You can't interpret the absolute value of AIC. (Different software packages may well give completely different AICs on the same data for the same model.) What you *can* interpret is the difference in AIC between different models applied to the same data. Burnham & Anderson give some rough rules of thumb: a difference of 2 means that both models are essentially equally good, 5 means that the model with the lower AIC is a bit better, 10 is pretty strong evidence that the lower AIC model is better. – Stephan Kolassa Dec 18 '14 at 15:43

As others said, there is not much point in evaluating a single model according to the absolute value of its AIC.

The point is to compare the AIC values of different models: the model with the lower AIC is the better one, in the sense that it is less complex but still a good fit for the data.

In no way do I mean that ONLY a less complex model has a lower AIC. I am saying "less complex but still a good fit for the data". A more complex model may be preferable if the simpler model is underfitting, so a less complex model is not necessarily better and does not necessarily have a lower AIC. In general, though, a less complex model that is not underfitting is better than a more complex one.
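
To make the trade-off concrete, here is a small sketch with hypothetical log-likelihoods and parameter counts (none of these numbers come from the question). Because AIC = 2k − 2·log-likelihood, a model with extra parameters gets the lower AIC only when its log-likelihood improves by more than the number of parameters added.

```python
def aic(loglik, k):
    """AIC = 2k - 2 * log-likelihood, with k the number of fitted parameters."""
    return 2 * k - 2 * loglik


def complex_model_preferred(loglik_simple, k_simple, loglik_complex, k_complex):
    """True when the more complex model has the strictly lower AIC."""
    return aic(loglik_complex, k_complex) < aic(loglik_simple, k_simple)


# Hypothetical simple model: 3 parameters, log-likelihood of -100.
simple = (-100.0, 3)

# Two extra parameters that barely improve the fit: the penalty wins.
small_gain = complex_model_preferred(*simple, loglik_complex=-99.5, k_complex=5)

# Two extra parameters that fix real underfitting: the better fit wins.
large_gain = complex_model_preferred(*simple, loglik_complex=-90.0, k_complex=5)

print("complex model preferred with small gain:", small_gain)
print("complex model preferred with large gain:", large_gain)
```

So AIC does not reward simplicity for its own sake: the more complex model wins whenever the improvement in fit outweighs the penalty for its extra parameters.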

Outcast
  • (-1) You probably do not mean this, but your answer implies that less complex models = lower AIC and better. But it is not necessary that a less complex model is better or has a lower AIC. Sometimes you need a more complex model! – mkt Aug 07 '18 at 12:25
  • In no serious way can you infer from my post that ONLY less complex model = lower AIC. In my post I am saying "less complex but still a good fit for the data". Obviously, a more complex model may be preferable if your model is underfitting, so obviously, as you are saying, "it is not necessary that a less complex model is better or has a lower AIC", but in general a less complex model which is not underfitting is better than a more complex one. – Outcast Aug 07 '18 at 12:34
  • I think it's a reasonable interpretation of your words, though I understand that it's not exactly what you mean. I hope you will edit your answer to make your meaning clearer than it presently is, in both answer and comment (I'll gladly remove my downvote when that happens). – mkt Aug 07 '18 at 12:38
  • @mkt is not alone -- I would have read your original answer exactly as mkt did. Had I seen the answer as it stood at that point, I would also likely have downvoted for the same reason. – Glen_b Aug 07 '18 at 14:48