I have a sparse regression problem (sparse because several of the inputs are factors, so the design matrix has many columns of 1s and 0s). I am considering ridge regression because of the sparsity, and also because many of the terms will have interaction effects. I also want the model to be interpretable.
Is there a way to use the ridge penalty with a linear regression model? If not, is there a base learner that lets me include interaction effects and still yields a sparse solution?
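
To make the setup concrete, here is a minimal sketch of the kind of model I have in mind, using NumPy only. Everything in it is an assumption for illustration: the two factors `f1` and `f2`, their level counts, the synthetic response, and the penalty values `lam` are made up, and the ridge fit is the textbook closed form rather than any particular library's implementation. As far as I understand, the ridge penalty shrinks coefficients but does not generally set them exactly to zero, which is why I am also asking about learners that give a sparse solution.

```python
import numpy as np

def one_hot(levels, n_levels):
    """Dummy-code an integer factor, dropping the first level as baseline."""
    return np.eye(n_levels)[levels][:, 1:]

def cross_interactions(A, B):
    """Products of every column of A with every column of B."""
    cols = [A[:, i] * B[:, j]
            for i in range(A.shape[1])
            for j in range(B.shape[1])]
    return np.column_stack(cols)

def fit_ridge(X, y, lam):
    """Closed-form ridge fit; the intercept is left unpenalised by centring."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    # Solve (Xc'Xc + lam * I) beta = Xc'yc
    coef = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ yc)
    intercept = y_mean - x_mean @ coef
    return intercept, coef

# Synthetic data (hypothetical): two factors with 3 and 4 levels.
rng = np.random.default_rng(0)
n = 200
f1 = rng.integers(0, 3, size=n)
f2 = rng.integers(0, 4, size=n)

A = one_hot(f1, 3)                    # 2 dummy columns
B = one_hot(f2, 4)                    # 3 dummy columns
X = np.column_stack([A, B, cross_interactions(A, B)])  # 2 + 3 + 6 columns

# Response with one main effect and one interaction effect plus noise.
y = 1.0 + 2.0 * A[:, 0] - 1.5 * A[:, 0] * B[:, 1] + rng.normal(0, 0.5, size=n)

intercept_small, coef_small = fit_ridge(X, y, lam=0.1)
intercept_large, coef_large = fit_ridge(X, y, lam=100.0)

# A larger penalty shrinks the coefficient vector towards zero.
print(np.linalg.norm(coef_small), np.linalg.norm(coef_large))
# Count of coefficients that are not exactly zero under the large penalty.
print(np.count_nonzero(coef_large), "of", coef_large.size)
```

The columns are ordered as main effects of `f1`, main effects of `f2`, then the `f1` by `f2` interactions, so each coefficient can be read off against a specific factor level or pair of levels.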