I would like to know how to pass the final variables selected by LASSO (cv.glmnet) to the lm command (as in the link below) to obtain their p-values, but for a binary outcome with binary predictors.
Thanks a lot.
Link on obtaining p-values for the final LASSO variables with a continuous outcome: LASSO Regression - p-values and coefficients
library(glmnet)
data <- read.csv(file = 'X.csv')
train_df <- na.omit(data)
# Predictors: build the dummy-variable design matrix and drop the intercept column
xfactors<-model.matrix(Outcome ~ ., data=train_df)[,-1]
xfactors
yfactors<- train_df$Outcome
yfactors
# Convert the design matrix to a plain matrix for glmnet
x<-as.matrix(data.frame(xfactors))
x
# Cross-validate to choose lambda, then refit at lambda.min
cv_fit <- cv.glmnet(x, y = yfactors, alpha = 1, family = "binomial",
                    type.measure = "class", nlambda = 100, nfolds = 10, intercept = TRUE)
fit <- glmnet(x, y = yfactors, alpha = 1, family = "binomial",
              lambda = cv_fit$lambda.min)
W <- as.matrix(coef(fit))
W
keep_X <- rownames(W)[W!=0]
keep_X <- keep_X[!keep_X == "(Intercept)"]
x <- x[,keep_X]
summary(lm(yfactors~x))
#### This gives the error below
Call:
lm(formula = yfactors ~ x)
Residuals:
Error in quantile.default(resid) : (unordered) factors are not allowed
In addition: Warning messages:
1: In model.response(mf, "numeric") :
using type = "numeric" with a factor response will be ignored
2: In Ops.factor(y, z$residuals) : ‘-’ not meaningful for factors
3: In Ops.factor(r, 2) : ‘^’ not meaningful for factors
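
The messages above suggest that yfactors is a factor, which lm cannot use as a response because it expects a numeric outcome. One possible way around this, sketched below, is to refit the selected predictors with glm and family = binomial, which accepts a two-level factor directly. This is only a minimal sketch on simulated data: x_sel, y, refit_df and the column names A and B are hypothetical stand-ins for the selected columns of x and for yfactors, and it uses base R only.

```r
# Simulated stand-in for the LASSO-selected binary predictors and the
# factor outcome (hypothetical data; x_sel and y are illustrative names)
set.seed(1)
n <- 200
x_sel <- cbind(A = rbinom(n, 1, 0.5), B = rbinom(n, 1, 0.3))
p <- plogis(-0.5 + 1.2 * x_sel[, "A"] - 0.8 * x_sel[, "B"])
y <- factor(rbinom(n, 1, p), levels = c(0, 1), labels = c("No", "Yes"))

# lm() needs a numeric response, so a factor response fails as shown above.
# glm() with family = binomial accepts a two-level factor: the first level
# is treated as failure and the other level as success.
refit_df <- data.frame(y = y, x_sel)
fit <- glm(y ~ ., data = refit_df, family = binomial)

# Coefficient table with estimates, standard errors, z values and p-values
coef_table <- summary(fit)$coefficients
print(coef_table)
```

With the real data, the equivalent call would presumably be along the lines of glm(yfactors ~ x, family = binomial) after subsetting x to keep_X. Note that p-values from a refit on the same data used for selection ignore the selection step and tend to be optimistic, so they are best read as descriptive.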