
I am doing stepwise regression as follows:

fit1 = lm (y_train ~ ., data = dat)
step = stepAIC(fit1, direction = "forward")
Error in stepAIC(fit1, direction = "forward") : 
  AIC is -infinity for this model, so 'stepAIC' cannot proceed

> length(y_train)
[1] 132
> dim(x_train)
[1]  132 1501

I searched on Google, but did not find a solution to this problem.

Jill Clover
  • Did you get any warnings from your lm fit? – mdewey Oct 20 '16 at 17:13
  • No, I did not. I also checked the data set. It is ok. – Jill Clover Oct 20 '16 at 17:36
  • You should be happy about it since it prohibits you from using a model selection method that leads to bad, overfitted models: http://stats.stackexchange.com/questions/20836/algorithms-for-automatic-model-selection/20856#20856 – Tim Oct 20 '16 at 17:44
  • Since $\text{AIC} = - 2 \log [ \mathcal{L} (\hat{\theta}) ] + 2p$ it would seem that $\mathcal{L} (\hat{\theta}) = \infty$ which suggests a perfect fit. Also, why are you beginning with all the variables and doing forward selection? – dsaxton Oct 20 '16 at 18:57
  • I also just noticed you have 1,500 variables and only 130 observations, which is clearly the reason for the error. – dsaxton Oct 20 '16 at 22:07
  • @dsaxton Can you explain it a little bit? Appreciated. – Jill Clover Oct 20 '16 at 22:08
  • I know backward is impossible but forward is ok for $p > n$ – Jill Clover Oct 20 '16 at 22:09
  • You have way more parameters than observations so you have a perfect fit. You need to start with far fewer variables, and also "forward" does not make sense when you already have a fully saturated model. – dsaxton Oct 21 '16 at 03:37

1 Answer


An AIC of negative infinity indicates a severely overfitted model, so it is actually fortunate that your stepAIC stopped. A working demo on a small data set is given further below.
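To see where the minus infinity comes from, here is a minimal sketch that reproduces the error with simulated data of the same shape (132 x 1501), since the original x_train/y_train are not available. With more coefficients than observations, lm() interpolates y_train exactly, the residual sum of squares is zero, and the AIC is $-\infty$:

library(MASS)

set.seed(1)
n <- 132; p <- 1501
x_train <- matrix(rnorm(n * p), nrow = n)   # simulated stand-in for the real predictors
y_train <- rnorm(n)                         # simulated stand-in for the real response
dat <- data.frame(y_train, x_train)

fit1 <- lm(y_train ~ ., data = dat)
sum(residuals(fit1)^2)                      # 0: the fit reproduces y_train exactly
AIC(fit1)                                   # -Inf
try(stepAIC(fit1, direction = "forward"))   # the same "AIC is -infinity" error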

First, validate the quality of your original data. Model selection schemes such as forward stepwise have a so-called breakdown point, beyond which the selected model can no longer be trusted.

This thesis finds that when p > n, there is a breakdown point for standard model selection schemes, such that model selection only works well below a certain critical complexity level depending on n/p. This notion is applied to some standard model selection algorithms (Classical Forward Stepwise, Forward Stepwise with False Discovery Rate thresholding, Lasso, LARS, and Stagewise Orthogonal Pursuit) in the case where p > n. (Source)


Around page 58, the thesis explains in more detail how factors such as sparsity and the noise level affect the model selection breakdown point.

The model selection breakdown point is worrying with roughly 1,500 variables and only 130 observations. As mentioned by dsaxton:

I also just noticed you have 1,500 variables and only 130 observations, which is clearly the reason for the error.

In other words, $p/n = \frac{1500}{130} \approx 11.5$, i.e. more than eleven variables per observation, which is far from rules of thumb such as having at least two times more observations than variables.

A working demo on a small data set:

> library(MASS)
> dat<-USJudgeRatings[,1]; 
> y_train<-USJudgeRatings[,2]; 
> fit1 = lm (y_train ~ dat)
> fit1

Call:
lm(formula = y_train ~ dat)

Coefficients:
(Intercept)          dat  
      8.832       -0.109  

> stepAIC(fit1, direction="forward")
Start:  AIC=-20.24
y_train ~ dat


Call:
lm(formula = y_train ~ dat)

Coefficients:
(Intercept)          dat  
      8.832       -0.109  
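As an aside, in the demo above direction = "forward" has nothing to do: without a scope argument the search space is just the starting model, so stepAIC returns it unchanged. A forward search is normally started from a small model (for example intercept-only) with an explicit upper scope. Here is a hedged sketch on the same USJudgeRatings data; the response RTEN and the candidate predictors are my own illustration, not part of the original answer:

library(MASS)

# Forward selection the usual way: start from the intercept-only model and
# let stepAIC add terms one at a time from an explicit candidate set.
null_fit <- lm(RTEN ~ 1, data = USJudgeRatings)
forward_fit <- stepAIC(null_fit,
                       scope = list(lower = ~ 1,
                                    upper = ~ CONT + INTG + DMNR + DILG + CFMG + DECI),
                       direction = "forward")
summary(forward_fit)

Even then, the breakdown-point argument above still applies: with roughly 1,500 candidate variables and only about 130 observations, any stepwise search is very likely to overfit.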

Related questions

  1. What to do if number of features is much larger than number of observations?

  2. Stanford doctoral thesis about model selection and the breakdown point for some simple models, such as regression, here

hhh