To fit a lasso model using glmnet, you can simply do the following, and glmnet will automatically calculate a reasonable range of lambda values appropriate for the data set:
glmnet(x, y, alpha = 1)
I know I can also do cross-validation natively using glmnet. However, I would like to use the caret package so I can train and compare multiple models in a unified fashion. The problem is that I don't know how to fix only the alpha parameter when calling glmnet from caret, because caret wants to tune over both alpha and lambda.
train(y ~ ., data = train, method = 'glmnet', trControl = ctrl, tuneGrid = data.frame(alpha = 1))
Error in train.default(x, y, weights = w, ...) :
The tuning parameter grid should have columns alpha, lambda
Is there any way in general to specify only one parameter and allow the underlying algorithms to take care of the unspecified parameters according to the default methods of the algorithms?
EDIT:
The reason I would like to let glmnet choose the lambda values is that the appropriate range for lambda can vary a lot, and glmnet does a good job of choosing a good range:
library(ElemStatLearn)
library(glmnet)
dat <- prostate
train <- subset(dat, train, select = -train)
train.x <- as.matrix(subset(train, select = -lpsa))
train.y <- train$lpsa
print(glmnet(train.x, train.y, alpha = 0)$lambda)
print(glmnet(train.x, train.y, alpha = 1)$lambda)
You can see that the ranges of lambda are very different. To use a custom model in train, I thought about running glmnet first and then assigning the calculated lambda values to the grid:
glmnetGrid <- function(x, y, len = NULL) {
  library(glmnet)
  # alpha is not an argument of this function, so it is undefined here
  lam <- glmnet(x, y, alpha = alpha)$lambda
  expand.grid(lambda = lam)
}
But then I need to pass alpha to the grid function, and I don't think the framework allows that. It also seems inefficient to run glmnet one more time just to get the lambda values. Is there a way to simply let glmnet run and assign its lambda values to the grid so that caret can use them in subsequent analysis?
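
For reference, the workaround I have in mind looks roughly like the sketch below. It uses simulated data rather than the prostate data (the data, the coefficients, and the 5-fold CV setup are assumptions for illustration only), and it still pays for the extra glmnet run that I would like to avoid. I am not claiming this is the recommended caret approach; it just passes a grid with both required columns, alpha held at 1 and lambda taken from glmnet's own path.

```r
library(glmnet)
library(caret)

# Simulated data (an assumption for illustration; not the prostate data above)
set.seed(1)
x <- matrix(rnorm(100 * 5), nrow = 100,
            dimnames = list(NULL, paste0("x", 1:5)))
y <- as.numeric(x %*% c(2, -1, 0, 0, 0.5) + rnorm(100))

# Let glmnet pick its own lambda path for the lasso (alpha = 1)
lam <- glmnet(x, y, alpha = 1)$lambda

# Hold alpha fixed and hand glmnet's lambda values to caret as the grid
grid <- expand.grid(alpha = 1, lambda = lam)
ctrl <- trainControl(method = "cv", number = 5)
fit <- train(x, y, method = "glmnet", trControl = ctrl, tuneGrid = grid)

# Best (alpha, lambda) pair according to caret's resampling
fit$bestTune
```

caret may warn about missing resampled performance values at the largest lambda, where the fitted model is intercept-only; that is a warning, not an error.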