
I am using lasso regression to identify the influential variables, out of a list of about 65, that affect an individual's liquor consumption.

The independent variables are a mix of categorical and numeric variables, such as State, Education, Sex, Age, income, and so on.

I used the glmnet package and chose lambda by cross-validation:

  fit = glmnet(x, y, alpha = 1, lambda = 0.072, thresh = 1e-12)
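For reference, a minimal sketch of how a lambda like the one above is typically chosen with `cv.glmnet`. The data here are synthetic stand-ins (the matrix `x_demo`, response `y_demo`, and the fold count are assumptions for illustration, not the survey data from the question).

```r
library(glmnet)

set.seed(1)
# Hypothetical stand-in data: 200 rows, 10 numeric predictors
n <- 200
p <- 10
x_demo <- matrix(rnorm(n * p), n, p)
y_demo <- 2 * x_demo[, 1] - 1.5 * x_demo[, 2] + rnorm(n)

# 10-fold cross-validation over glmnet's default lambda path (alpha = 1 is the lasso)
cvfit <- cv.glmnet(x_demo, y_demo, alpha = 1, nfolds = 10)

cvfit$lambda.min  # lambda with the lowest mean cross-validated error
cvfit$lambda.1se  # largest lambda whose error is within one SE of that minimum

# Coefficients at the chosen lambda; "." in the printout marks an exact zero
coef(cvfit, s = "lambda.min")
```

`lambda.1se` gives a sparser model than `lambda.min`, so which of the two is used affects how many coefficients end up nonzero.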

The lasso returned 25 variables with nonzero coefficients; all the rest are 0.

The beta values are as follows:

  fit$beta

  State           -0.350
  Education       -0.254
  Age              0.175 
  Sex              .
  ...              ....

Education is a categorical variable with 5 levels: No school, High school, Graduate, Masters, Doctorate. Linear regression would give 4 beta estimates, one for each level except the reference level, but the lasso gives only one beta for Education. I am not able to interpret this beta for a categorical (factor) variable.
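One possible explanation for the single Education coefficient, assuming `x` was built with something like `data.matrix()` (the question does not say how `x` was constructed): the factor was silently replaced by its integer level codes, so the lasso saw Education as one numeric column. The sketch below uses a hypothetical toy data frame and base R only to show the difference between the two encodings.

```r
# Hypothetical toy data; the level names are the ones listed in the question
edu_levels <- c("No school", "High school", "Graduate", "Masters", "Doctorate")
df <- data.frame(
  Education = factor(
    c("No school", "Masters", "Graduate", "Doctorate", "High school", "Graduate"),
    levels = edu_levels
  ),
  Age = c(23, 41, 35, 52, 30, 47)
)

# data.matrix() replaces each factor by its internal integer code (1 to 5 here),
# so Education stays a single numeric column
x_codes <- data.matrix(df)
colnames(x_codes)
x_codes[, "Education"]

# model.matrix() expands the factor into 0/1 indicator columns,
# one per level except the reference level ("No school")
x_dummy <- model.matrix(~ ., data = df)[, -1]  # drop the intercept column
colnames(x_dummy)
```

If `x` was coded the first way, a coefficient such as -0.254 would be a slope per one-step increase in the level code, which is only meaningful if the levels are ordered and roughly equally spaced.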

  • How should I interpret these lasso coefficients and their signs?
  • For a numeric variable like Age, is the coefficient interpreted the same way as in linear regression?

I found some clues in "Categorical variables in LASSO regression", but I am not sure how to relate them to the betas I got here.

kjetil b halvorsen
joy_1379
  • `glmnet` cannot handle categorical variables directly; you need to convert them to dummy variables as described [here](https://stats.stackexchange.com/a/210075) – drmaettu Oct 22 '21 at 08:30
  • Is there any other package/function that creates dummies automatically, like the lm or glm functions do? – joy_1379 Oct 22 '21 at 10:26
  • hmm I've always used `glmnet` so I'm not aware of such a package. But converting to dummies is really easy, just write: `fit = glmnet(model.matrix( ~ . -1, x), y)`. What this does is create a design matrix without the intercept (hence the -1), which will be taken care of by `glmnet`. – drmaettu Oct 22 '21 at 12:28
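
The suggestion in the last comment can be sketched end to end as follows. The data frame, the simulated response, and the lambda value are hypothetical and chosen only to make the example runnable; `x` is assumed to be a data frame with factor columns, as in the comment.

```r
library(glmnet)

set.seed(42)
n <- 300
# Hypothetical data frame with one factor and one numeric predictor
df <- data.frame(
  Education = factor(sample(
    c("No school", "High school", "Graduate", "Masters", "Doctorate"),
    n, replace = TRUE
  )),
  Age = round(runif(n, 18, 70))
)
# Simulated response: depends on Age and on one Education level only
y <- 0.03 * df$Age - 1.0 * (df$Education == "Doctorate") + rnorm(n)

# Design matrix without an intercept column; glmnet fits its own intercept.
# With "- 1" the first factor keeps an indicator for every one of its levels.
x <- model.matrix(~ . - 1, data = df)
colnames(x)

# Lasso fit at an arbitrary illustrative lambda
fit <- glmnet(x, y, alpha = 1, lambda = 0.05)
coef(fit)  # intercept plus one row per column of x
```

With this encoding each Education level has its own coefficient, and the lasso can shrink some levels to zero while keeping others. A level whose coefficient is zero is effectively pooled into the baseline captured by the intercept, so the nonzero coefficients read as shifts relative to those pooled levels.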
