
I am using glmnet for LASSO. My data set contains several continuous variables and one categorical variable with four levels. Can I treat its three dummy variables like the other continuous variables, or should I use a group LASSO approach for the three dummies?

Steffen Moritz
Jenny
  • Normally, yes, you keep your factors all together. There are several R packages that can do this, including `glmnet`. – Glen_b Sep 10 '14 at 00:49
  • @Glen_b What are the options in `glmnet` for running group lasso with categorical variables? I don't see anything about categorical variables at https://web.stanford.edu/~hastie/glmnet/glmnet_alpha.html or https://cran.r-project.org/web/packages/glmnet/glmnet.pdf – Adrian Feb 22 '17 at 16:57
  • See the `type.multinomial` argument to `glmnet` – Glen_b Feb 22 '17 at 22:54
  • Perhaps leave the dummies unpenalized (as is done with the intercept), unless you have a good reason to put a constraint on them. To do that, just add the dummies and give them zero penalty weights (the `penalty.factor` argument, if you use the glmnet implementation). – Jonas Striaukas Mar 25 '20 at 23:43
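The last comment's suggestion can be sketched in R roughly as follows. This is only an illustration on simulated data: the data frame `df`, the factor `grp`, and the value `s = 0.1` are made up for the example, and it assumes the `glmnet` package is installed. It leaves the three dummies unpenalized; it does not perform group selection.

```r
library(glmnet)

set.seed(1)
n <- 200
# Hypothetical data: two continuous predictors and one four-level factor
df <- data.frame(
  x1  = rnorm(n),
  x2  = rnorm(n),
  grp = factor(sample(c("a", "b", "c", "d"), n, replace = TRUE))
)
y <- 1 + 2 * df$x1 + ifelse(df$grp == "c", 1.5, 0) + rnorm(n)

# model.matrix expands the factor into three dummies; drop the intercept
# column because glmnet fits its own intercept
X <- model.matrix(~ x1 + x2 + grp, data = df)[, -1]

# penalty.factor = 0 exempts a column from the penalty, 1 penalizes it as usual
pf <- ifelse(grepl("^grp", colnames(X)), 0, 1)

fit <- glmnet(X, y, alpha = 1, penalty.factor = pf)

# Coefficients at an arbitrary penalty value: the dummies are never shrunk,
# while x1 and x2 are still subject to the lasso penalty
coef(fit, s = 0.1)
```

Note that glmnet rescales the penalty factors internally so that they sum to the number of variables, so the lambda values are not directly comparable to a fit without `penalty.factor`.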

1 Answer


As far as I am aware, glmnet doesn't have this feature implemented yet. The `type.multinomial` argument that @Glen_b suggested groups the coefficients of each variable across all response classes in a multinomial model, but there's no way of grouping independent variables within a model. See

https://cran.r-project.org/web/packages/grplasso/grplasso.pdf

for an alternative.
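A minimal sketch of what that could look like with `grplasso`, on simulated data, treating the three dummies as one group. The data frame `df`, the factor `grp`, and the lambda grid are invented for the illustration, and it assumes the `grplasso` package is installed.

```r
library(grplasso)

set.seed(1)
n <- 200
# Hypothetical data: two continuous predictors and one four-level factor
df <- data.frame(
  x1  = rnorm(n),
  x2  = rnorm(n),
  grp = factor(sample(c("a", "b", "c", "d"), n, replace = TRUE))
)
df$y <- 1 + 2 * df$x1 + ifelse(df$grp == "c", 1.5, 0) + rnorm(n)

# Design matrix with an explicit intercept column followed by x1, x2
# and the three dummies of grp
X <- model.matrix(~ x1 + x2 + grp, data = df)

# Group index: NA = unpenalized (intercept), equal integers = same group
index <- c(NA, 1, 2, 3, 3, 3)

# lambdamax gives the penalty at which the first penalized group enters;
# use a decreasing multiplicative grid below it
lam_max <- lambdamax(X, y = df$y, index = index,
                     model = LinReg(), penscale = sqrt)
lambda <- lam_max * 0.5^(1:5)

fit <- grplasso(X, y = df$y, index = index, lambda = lambda,
                model = LinReg(), penscale = sqrt,
                control = grpl.control(trace = 0))

# One column per lambda; the three grp rows enter or leave together
round(fit$coefficients, 3)
```

Because the dummies share one index value, they are selected or dropped as a block, which is the behavior the question asks about. The group lasso solution depends on how the factor is coded, so the choice of contrasts is worth checking against the package documentation.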

nolanp2