Getting started with regularization (Lasso)

Question

I have a small dataset of 55 observations with a binary outcome variable; only 11 of the outcomes are 1's and the rest are 0's.

I was wondering whether Lasso would be a useful tool for predicting my outcome here; even if not, I thought I'd still learn a thing or two.

I can get the model to run and display coefficients and p-values by typing:

dslogit outcomeY x1 x2 x3 x4...xn, controls(c1, c2, c3...cn)

It actually looks great: the p-values are much better than the ones I get from my highly unstable multiple regression (I realize multiple regression is not a great idea with such a small dataset). But knowing that when something seems too good to be true, it usually is, I ask you: what's my mistake here, and what should I be looking out for before I go tell everyone about my magnificent results?

(1) How did you calculate the p-values? - what null hypotheses are relevant to you? (2) What do you want from the model in any case? — Scortchi - Reinstate Monica, Jan 09 '20 at 01:17
I have no idea how the p-values were calculated. Like I wrote, I'm just getting started. `dslogit` outputs p-values and odds ratios! I have about 10 predictors. The null hypothesis is that nothing can predict the outcome, so it's a bit far-fetched. — Paze, Jan 09 '20 at 09:29
Correct calculation & interpretation of p-values for LASSO, or for penalized regression in general, is a thorny problem. [Here](https://stats.stackexchange.com/q/410173/17230) might be a good place to start reading, as well as, of course, the manual for `dslogit`. What I was getting at with my 2nd question is that there might not be any reason to care about p-values if the point of choosing LASSO was to improve out-of-sample predictive performance - & if it was, you ought to be estimating that. — Scortchi - Reinstate Monica, Jan 09 '20 at 16:34
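To make the last comment's suggestion concrete, here is a minimal sketch of estimating out-of-sample performance with stratified 5-fold cross-validation. It is not Stata and it is not what `dslogit` does: it is a plain L1-penalized logistic regression fitted by proximal gradient descent in pure Python, run on synthetic pure-noise data shaped like the question (55 observations, 11 events, 10 predictors). All names (`fit_l1_logit`, `cv_brier`, the penalty `lam=0.05`) are hypothetical choices for illustration. The out-of-fold Brier score of the model is compared with that of a predictor that always outputs the training base rate, which scores about 0.16 here.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(w, t):
    # Proximal operator of the L1 penalty: shrink w toward zero by t.
    return math.copysign(max(abs(w) - t, 0.0), w)

def fit_l1_logit(X, y, lam, steps=300, lr=0.5):
    """L1-penalized logistic regression via proximal gradient descent.
    The intercept is not penalized. Assumes roughly standardized predictors."""
    n, p = len(X), len(X[0])
    b0, w = 0.0, [0.0] * p
    for _ in range(steps):
        g0, g = 0.0, [0.0] * p
        for xi, yi in zip(X, y):
            r = sigmoid(b0 + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            g0 += r / n
            for j in range(p):
                g[j] += r * xi[j] / n
        b0 -= lr * g0
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, g)]
    return b0, w

def predict(model, X):
    b0, w = model
    return [sigmoid(b0 + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]

def stratified_folds(y, k, rng):
    """Assign each observation to one of k folds, spreading the 1's evenly."""
    folds = [0] * len(y)
    for label in (0, 1):
        idx = [i for i, yi in enumerate(y) if yi == label]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[i] = pos % k
    return folds

def cv_brier(X, y, lam, k=5, seed=0):
    """Out-of-fold Brier score for the lasso model and a base-rate predictor."""
    folds = stratified_folds(y, k, random.Random(seed))
    model_err = base_err = 0.0
    for f in range(k):
        train = [i for i in range(len(y)) if folds[i] != f]
        test = [i for i in range(len(y)) if folds[i] == f]
        y_train = [y[i] for i in train]
        model = fit_l1_logit([X[i] for i in train], y_train, lam)
        base = sum(y_train) / len(y_train)  # always predict the training event rate
        for i, p_hat in zip(test, predict(model, [X[i] for i in test])):
            model_err += (p_hat - y[i]) ** 2
            base_err += (base - y[i]) ** 2
    return model_err / len(y), base_err / len(y)

# Synthetic stand-in for the question's data (an assumption, not the real data):
# 11 events, 44 non-events, 10 predictors that are pure noise.
data_rng = random.Random(1)
y = [1] * 11 + [0] * 44
X = [[data_rng.gauss(0, 1) for _ in range(10)] for _ in y]

model_brier, base_brier = cv_brier(X, y, lam=0.05)
print(f"lasso logit, out-of-fold Brier: {model_brier:.3f}")
print(f"base rate,   out-of-fold Brier: {base_brier:.3f}")
```

With only 11 events, each test fold contains just two or three 1's, so the estimate is itself noisy; repeating the procedure over several seeds gives a feel for that variability. If the penalized model cannot beat the base-rate predictor out of fold, attractive in-sample p-values are not evidence of predictive value.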
