using caret and glmnet for variable selection

Question

I am using the caret and glmnet package for variable selection. I only want to find the best model and the coefficients and use them for a different model. Please help me understand the differences between caret and glmnet. Here's an example:

using state data:

data(state)
statedata <- data.frame(state.x77,row.names=state.abb,check.names=T)
statedata <- scale(statedata)

in caret:

library(caret)
fitControl <- trainControl(## 10-fold CV
  method = "repeatedcv",
  number = 5,
  repeats = 5)

lassoFit1 <- train(Life.Exp ~ . , data = statedata,
             method = "glmnet",
             trControl = fitControl)

I get alpha=1 and lambda=0.1 as best values. But how do I get the final best model? I tried

coef(lassoFit1$finalModel)

but I get a whole list of models.

coef(lassoFit1$bestTune)
coef(lassoFit1$bestTune$.lambda)

both return NULL.

Here's how I do it in glmnet:

library(glmnet)
cvfit <- cv.glmnet(x=statedata[,c(1:3,5,6,7,8)],y=statedata[,4])

When I type

coef(cvfit, s="lambda.min")

I get the coefficients. Is this the best model? It looks like glmnet does test more lambdas in the cross validation, is that true? Does caret or glmnet lead to a better model?

How do I manage to extract the best final model from caret and glmnet and plug it in a cox hazard model for example?

thanks

you may want to try `predictors(lassoFit1)`, which lists predictors used in the final model. — David Ren, Feb 12 '16 at 00:32

score 3 · Accepted Answer · answered Jun 07 '20 at 17:46

If you check the lambdas and your best lambda obtained from caret, you will see that it is not present in the model:

lassoFit1$bestTune$lambda
[1] 0.01545996
lassoFit1$bestTune$lambda %in% lassoFit1$finalModel$lambda
[1] FALSE

If you do:

coef(lassoFit1$finalModel,lassoFit1$bestTune$lambda)
8 x 1 sparse Matrix of class "dgCMatrix"
                        1
(Intercept) -4.532659e-15
Population   1.493984e-01
Income       .           
Illiteracy   .           
Murder      -7.929823e-01
HS.Grad      2.669362e-01
Frost       -1.979238e-01
Area         .

It will give you the values from the lambda it tested, that is closest to your best tune lambda. You can of course re-fit the model again with your specified lambda and alpha:

fit = glmnet(x=statedata[,c(1:3,5,6,7,8)],y=statedata[,4],
lambda=lassoFit1$bestTune$lambda,alpah=lassoFit1$bestTune$alpha)
> fit$beta
7 x 1 sparse Matrix of class "dgCMatrix"
                   s0
Population  0.1493747
Income      .        
Illiteracy  .        
Murder     -0.7929223
HS.Grad     0.2669745
Frost      -0.1979134
Area        .

Which you can see is close enough to the first approximation.

To answer your other questions:

I get the coefficients. Is this the best model?

You did coef(cvfit, s="lambda.min") which is the lambda with the least error. If you read the glmnet paper, they go with Breimen's 1SE rule (see this for a complete view), as it calls uses a less complicated model. You might want to consider using coef(cvfit, s="lambda.1se").

does test more lambdas in the cross validation, is that true? Does caret or glmnet lead to a better model?It looks like glmnet

by default cv.glmnet test a defined number of lambdas, in this example it is 67 but you can specify more by passing lambda=<your set of lambda to test>. You should get similar values using caret or cv.glmnet, but note that you cannot vary alpha with cv.glmnet()

How do I manage to extrage the best final model from caret and glmnet and plug it in a cox hazard model for example?

I guess you want to take the non-zero coefficients. and you can do this by

#exclude intercept
res = coef(cvfit, s="lambda.1se")[-1,]
names(res)[which(res!=0)]
[1] "Murder"  "HS.Grad"

using caret and glmnet for variable selection

1 Answers1