
I'm working with a very small sample size (N=14) and I'm using AICc to identify the most parsimonious model from a large pool of possible predictors. Unexpectedly, the best model has six predictors! I would tend to believe this is an over-fitted model, yet AICc should penalize for additional predictors and should be suited to samples smaller than N=40.
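(For reference, the small-sample correction I have in mind is the usual one,

$$\mathrm{AICc} = \mathrm{AIC} + \frac{2k(k+1)}{n-k-1},$$

which, if I count six slopes plus the intercept and the error variance so that $k = 8$, already adds $2 \cdot 8 \cdot 9 / 5 \approx 28.8$ to the selected model's score when $n = 14$.)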

Should I set a maximum number of predictors a priori to consider in the model selection, or should I trust AICc?

Many thanks


1 Answer


There is only one possible good reason for this: that the signal:noise ratio in your data is very high, i.e., the true $R^2$ is intrinsically high. But more likely you have used AIC to compare more than 3 possible models and you are just seeing noise. AIC is a restatement of $P$-values and as such has all the problems of $P$-value-guided stepwise variable selection. AIC just uses a better (i.e., larger) $\alpha$ cutoff than 0.05. In general, if you have more than 3 or 4 pre-specified models to compare, AIC has a low probability of selecting the "right" model.
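To make the larger cutoff concrete: when a single parameter is added, preferring the bigger model by AIC is the same as a likelihood-ratio test with critical value 2,

$$\Delta\mathrm{AIC} = \mathrm{AIC}_{\text{larger}} - \mathrm{AIC}_{\text{smaller}} < 0 \iff 2\,\Delta\log\hat{L} > 2 \iff \chi^2_1 > 2,$$

which corresponds to $\alpha \approx 0.157$ rather than 0.05.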

On most any dataset (although yours may be too small) you can check all this by bootstrapping the entire variable selection process.
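A minimal sketch of what such a bootstrap check could look like, assuming ordinary least-squares fits, a Gaussian AICc, and exhaustive best-subsets search (the function names, subset-size limit, and resample count below are illustrative):

```python
import itertools
import numpy as np

def aicc(rss, n, k):
    """Gaussian AICc; k counts the regression coefficients plus the error variance."""
    return n * np.log(rss / n) + 2 * k + 2 * k * (k + 1) / (n - k - 1)

def best_subset(X, y, max_size=6):
    """Return the column indices of the AICc-best subset (intercept always included)."""
    n = len(y)
    best = (np.inf, ())
    for size in range(0, max_size + 1):
        for cols in itertools.combinations(range(X.shape[1]), size):
            Xs = np.column_stack([np.ones(n)] + [X[:, c] for c in cols])
            k = Xs.shape[1] + 1            # coefficients + error variance
            if n - k - 1 <= 0:             # AICc undefined; skip
                continue
            rss = np.sum((y - Xs @ np.linalg.lstsq(Xs, y, rcond=None)[0]) ** 2)
            score = aicc(rss, n, k)
            if score < best[0]:
                best = (score, cols)
    return best[1]

def bootstrap_selection(X, y, n_boot=500, seed=0):
    """Re-run the whole selection on bootstrap resamples; tally how often each predictor is kept."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)        # resample rows with replacement
        chosen = best_subset(X[idx], y[idx])
        counts[list(chosen)] += 1
    return counts / n_boot                 # selection frequency per predictor

# Example with invented noise data (12 candidate predictors, n = 14):
# rng = np.random.default_rng(0)
# X, y = rng.standard_normal((14, 12)), rng.standard_normal(14)
# print(bootstrap_selection(X, y, n_boot=200))
```

If the selection frequencies bounce around rather than consistently favouring the same six variables, that is a sign the original "best" model is noise-driven.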

AIC is most helpful when doing highly structured assessment of a large group of parameters, e.g., "I have a 5-variable model and there are 7 other variables not thought in the literature to be relevant. Will I improve model performance by adding the 7?". Or "I have a linear additive model in 6 predictors. What is the value of expanding all of them into restricted cubic splines to allow for nonlinearity?".
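A minimal sketch of the first, chunk-style comparison, assuming a Gaussian linear model fit by ordinary least squares (the data, dimensions, and function name below are invented for illustration):

```python
import numpy as np

def gaussian_aic(X, y):
    """AIC of an OLS fit with an intercept; k counts the coefficients plus the error variance."""
    n = len(y)
    Xd = np.column_stack([np.ones(n), X])
    rss = np.sum((y - Xd @ np.linalg.lstsq(Xd, y, rcond=None)[0]) ** 2)
    k = Xd.shape[1] + 1
    return n * np.log(rss / n) + 2 * k

rng = np.random.default_rng(0)
n = 100
X_base = rng.standard_normal((n, 5))    # the 5 pre-specified predictors
X_extra = rng.standard_normal((n, 7))   # the block of 7 candidate additions
y = X_base @ rng.standard_normal(5) + rng.standard_normal(n)

# One pre-specified comparison: add or drop the 7 extra variables as a single chunk.
aic_base = gaussian_aic(X_base, y)
aic_full = gaussian_aic(np.column_stack([X_base, X_extra]), y)
print("keep the chunk of 7" if aic_full < aic_base else "drop the chunk of 7")
```

The point is that the 7 variables enter or leave together, so only one comparison is made.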

Frank Harrell
  • So what is the problem with AICc (or AIC)? Theoretically, it should be an "unbiased" (in a certain sense) selector of the "right" model from a pool of models, which is nice. But apparently there is something "wrong" with it as you are sceptical about it. What is the fault? Does it suffer from low precision (i.e. even though it is unbiased, it often selects something other than the "right" model)? Could you recommend any enlightening literature to put your critique into context? Also, what would you recommend instead (for the OP)? – Richard Hardy Jan 22 '15 at 12:56
  • @RichardHardy: See [Algorithms for automatic model selection](http://stats.stackexchange.com/questions/20836/). Any measure of a model's fit is a statistic, subject to random variation, & the more models you examine before picking the "best", the more optimistically biased is the apparent fit of that "best" model (a quick simulation of this optimism follows the thread). A *procedure* that restricts your room to manoeuvre in the selection & fitting of models deserves to be called "parsimonious" much more than a model got by cherry-picking a single combination of predictors from a huge number. – Scortchi - Reinstate Monica Jan 22 '15 at 13:33
  • @Scortchi: Thanks for the link, that's a great source! However, it mainly addresses *stepwise* selection, while I care about the (critique of) AIC-based selection in general. Could you or Frank Harrell direct me to something more relevant? (Again, my guess is that AIC suffers from low precision as in my comment above.) – Richard Hardy Jan 22 '15 at 15:19
  • @RichardHardy: The arguments made there against stepwise selection apply *a fortiori* to best-subsets selection. Even keen proponents of the use of AIC in model selection such as Burnham & Anderson (2002), *Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach*, recommend selecting the model with lowest AIC from among *a few*, theoretically well-founded, candidate models, & inveigh against data dredging. The issues aren't peculiar to the use of AIC. – Scortchi - Reinstate Monica Jan 22 '15 at 17:28
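A quick sketch of the selection optimism mentioned above, under assumed conditions (pure-noise data, simple one-predictor OLS fits, apparent $R^2$ as the fit measure):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, n_sim = 14, 10, 1000
best_r2 = []

for _ in range(n_sim):
    X = rng.standard_normal((n, p))
    y = rng.standard_normal(n)          # y is unrelated to every predictor
    r2 = []
    for j in range(p):                  # examine p candidate one-predictor models
        Xj = np.column_stack([np.ones(n), X[:, j]])
        resid = y - Xj @ np.linalg.lstsq(Xj, y, rcond=None)[0]
        r2.append(1 - resid.var() / y.var())
    best_r2.append(max(r2))             # apparent fit of the "best" of the p models

print(np.mean(best_r2))   # typically around 0.25, even though the true R^2 is 0
```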