Questions tagged [aic]

AIC stands for the Akaike Information Criterion, which is one technique used to select the best model from a class of models using a penalized likelihood. Among candidate models fitted to the same data, a smaller AIC indicates a better model.

$\mathit{AIC} = 2k - 2\ln(L)$

where $k$ is the number of parameters in the statistical model, and $L$ is the maximized value of the likelihood function for the estimated model.
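
As a minimal sketch of the formula above (not code from any particular library), the following computes AIC from a parameter count and a maximized log-likelihood. The function name `aic` and the two log-likelihood values are hypothetical and chosen only for illustration.

```python
def aic(k: int, log_likelihood: float) -> float:
    """Return AIC = 2k - 2 ln(L), where log_likelihood is ln(L) at its maximum."""
    return 2 * k - 2 * log_likelihood


# Hypothetical maximized log-likelihoods for two models fitted to the same data.
aic_small = aic(k=3, log_likelihood=-120.0)  # 2*3 + 240 = 246.0
aic_large = aic(k=5, log_likelihood=-118.5)  # 2*5 + 237 = 247.0

# The larger model fits slightly better, but not by enough to offset
# the penalty for its two extra parameters, so the smaller model is preferred.
preferred = "small" if aic_small < aic_large else "large"
print(aic_small, aic_large, preferred)
```

Note that the function takes $\ln(L)$ directly rather than $L$, since most fitting software reports the log-likelihood.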

$\mathit{AICc} = \mathit{AIC} + \frac{2k(k + 1)}{n - k - 1}$

AICc is AIC with a correction for finite sample sizes, where $n$ denotes the sample size. Thus, AICc is AIC with a greater penalty for extra parameters.
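
The size of the correction can be illustrated with a short sketch, assuming the AICc formula above. The function name `aicc` and the example values of $k$, $n$, and the log-likelihood are hypothetical; the check that $n > k + 1$ is an added safeguard, since the correction term is undefined or negative otherwise.

```python
def aicc(k: int, n: int, log_likelihood: float) -> float:
    """Return AICc = AIC + 2k(k + 1) / (n - k - 1), with AIC = 2k - 2 ln(L)."""
    if n - k - 1 <= 0:
        # Safeguard (an assumption of this sketch): the correction needs n > k + 1.
        raise ValueError("AICc requires n > k + 1")
    aic_value = 2 * k - 2 * log_likelihood
    return aic_value + (2 * k * (k + 1)) / (n - k - 1)


# Same hypothetical model (k = 3, ln L = -120, so AIC = 246) at two sample sizes.
small_sample = aicc(k=3, n=20, log_likelihood=-120.0)    # 246 + 24/16 = 247.5
large_sample = aicc(k=3, n=2000, log_likelihood=-120.0)  # 246 + 24/1996

print(small_sample, large_sample)
```

With $n = 20$ the correction adds 1.5 to the AIC, while with $n = 2000$ it adds roughly 0.01, so AICc approaches AIC as the sample size grows.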

AIC was introduced by Hirotugu Akaike in his seminal 1973 paper "Information Theory and an Extension of the Maximum Likelihood Principle" (in: B. N. Petrov and F. Csaki, eds., 2nd International Symposium on Information Theory, Akadémiai Kiadó, Budapest, pp. 267–281).

References:

Wikipedia

"Information Theory and an Extension of the Maximum Likelihood Principle" (starts on page 610).

928 questions
265
votes
13 answers

Is there any reason to prefer the AIC or BIC over the other?

The AIC and BIC are both methods of assessing model fit penalized for the number of estimated parameters. As I understand it, BIC penalizes models more for free parameters than does AIC. Beyond a preference based on the stringency of the criteria,…
russellpierce
  • 17,079
  • 16
  • 67
  • 98
228
votes
8 answers

Algorithms for automatic model selection

I would like to implement an algorithm for automatic model selection. I am thinking of doing stepwise regression but anything will do (it has to be based on linear regressions though). My problem is that I am unable to find a methodology, or an…
S4M
  • 2,432
  • 3
  • 13
  • 6
86
votes
5 answers

What do the residuals in a logistic regression mean?

In answering this question John Christie suggested that the fit of logistic regression models should be assessed by evaluating the residuals. I'm familiar with how to interpret residuals in OLS, they are in the same scale as the DV and very clearly…
russellpierce
  • 17,079
  • 16
  • 67
  • 98
72
votes
7 answers

Do all interactions terms need their individual terms in regression model?

I am actually reviewing a manuscript where the authors compare 5-6 logit regression models with AIC. However, some of the models have interaction terms without including the individual covariate terms. Does it ever make sense to do this? For example…
djhocking
  • 1,701
  • 3
  • 17
  • 21
49
votes
3 answers

AIC,BIC,CIC,DIC,EIC,FIC,GIC,HIC,IIC --- Can I use them interchangeably?

On p. 34 of his PRNN Brian Ripley comments that "The AIC was named by Akaike (1974) as 'An Information Criterion' although it seems commonly believed that the A stands for Akaike". Indeed, when introducing the AIC statistic, Akaike (1974, p.719)…
Hibernating
  • 3,723
  • 2
  • 21
  • 34
46
votes
1 answer

Negative values for AIC in General Mixed Model

I'm trying to select the best model by the AIC in the General Mixed Model test. The best model is the model with the lowest AIC, but all my AIC's are negative! So is the biggest negative AIC the lowest value? Or is the smallest negative AIC the…
44
votes
5 answers

AIC guidelines in model selection

I typically use BIC as my understanding is that it values parsimony more strongly than does AIC. However, I have decided to use a more comprehensive approach now and would like to use AIC as well. I know that Raftery (1995) presented nice guidelines…
Tom Carpenter
  • 849
  • 2
  • 8
  • 13
43
votes
5 answers

Negative values for AICc (corrected Akaike Information Criterion)

I have calculated AIC and AICc to compare two general linear mixed models. The AICs are positive, with model 1 having a lower AIC than model 2. However, the values for AICc are both negative (model 1 is still < model 2). Is it valid to use and…
Freya Harrison
  • 3,212
  • 4
  • 25
  • 31
39
votes
3 answers

Logistic Regression: Bernoulli vs. Binomial Response Variables

I want to perform logistic regression with the following binomial response and with $X_1$ and $X_2$ as my predictors. I can present the same data as Bernoulli responses in the following format. The logistic regression outputs for these 2 data sets…
38
votes
3 answers

What does the Akaike Information Criterion (AIC) score of a model mean?

I have seen some questions here about what it means in layman terms, but these are too layman for my purpose here. I am trying to mathematically understand what the AIC score means. But at the same time, I don't want a rigorous proof that…
caveman
  • 2,431
  • 1
  • 16
  • 32
36
votes
3 answers

Is it possible to calculate AIC and BIC for lasso regression models?

Is it possible to calculate AIC or BIC values for lasso regression models and other regularized models where parameters are only partially entering the equation? How does one determine the degrees of freedom? I'm using R to fit lasso regression…
Jota
  • 804
  • 1
  • 10
  • 21
35
votes
3 answers

Can AIC compare across different types of model?

I'm using AIC (Akaike's Information Criterion) to compare non-linear models in R. Is it valid to compare the AICs of different types of model? Specifically, I'm comparing a model fitted by glm versus a model with a random effect term fitted by glmer…
Thomas K
  • 453
  • 1
  • 4
  • 5
30
votes
3 answers

What is the difference in what AIC and c-statistic (AUC) actually measure for model fit?

Akaike Information Criterion (AIC) and the c-statistic (area under ROC curve) are two measures of model fit for logistic regression. I am having trouble explaining what is going on when the results of the two measures are not consistent. I guess…
timbp
  • 1,067
  • 1
  • 11
  • 17
30
votes
3 answers

Prerequisites for AIC model comparison

What are exactly the prerequisites, that need to be fulfilled for AIC model comparison to work? I just came around this question when I did comparison like this: > uu0 = lm(log(usili) ~ rok) > uu1 = lm(usili ~ rok) > AIC(uu0) [1] 3192.14 >…
Tomas
  • 5,735
  • 11
  • 52
  • 93
28
votes
3 answers

AIC versus cross validation in time series: the small sample case

I am interested in model selection in a time series setting. For concreteness, suppose I want to select an ARMA model from a pool of ARMA models with different lag orders. The ultimate intent is forecasting. Model selection can be done by cross…
Richard Hardy
  • 54,375
  • 10
  • 95
  • 219