Suppose I use lasso and a regression tree, respectively, to generate lists of important variables. When will the two methods produce very different lists? Under what signal structure?
I've reopened the question -- initially I've misread the question. Apologies! – Sycorax Jun 08 '20 at 12:56
1 Answer
It will depend on what feature engineering you do in advance, e.g. creating interaction terms to give to lasso.
Beyond that, I don't have an exhaustive answer, but here are two scenarios where one method will find importance that the other misses:
Suppose $X$ and $Z$ are independent of each other and moderately correlated with $Y$, and that $XZ$ is very strongly predictive of $Y$; for example, they are binary and their exclusive-or is a strong predictor. Lasso will completely miss the $XZ$ interaction effect unless it is given $XZ$ as a feature, but a tree can split on $X$ and then $Z$, or on $Z$ and then $X$. So the tree will rate $X$ and $Z$ as more important than lasso does.
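As a rough sketch of this first scenario, here is a simulation that uses only the Python standard library. Everything in it is an assumption made for illustration: the pure exclusive-or signal with $X$ and $Z$ coded as $\pm 1$ (so their marginal correlation with $Y$ is about zero), the sample size, the noise level, the penalty `lam`, and the hand-rolled helpers `standardise`, `soft_threshold`, `lasso` and `sse`. The `lasso` helper is a plain coordinate-descent solver, not any particular library's implementation, and the "tree" is simply grown to depth 2 with no minimum-gain stopping rule.

```python
import random


def standardise(values):
    """Centre to mean 0 and scale to (population) standard deviation 1."""
    n = len(values)
    mean = sum(values) / n
    sd = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    return [(v - mean) / sd for v in values]


def soft_threshold(value, lam):
    """Shrink value towards zero by lam; exactly 0.0 inside [-lam, lam]."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso(columns, y, lam, sweeps=200):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam*||b||_1 on standardised data."""
    n = len(y)
    cols = [standardise(c) for c in columns]
    target = standardise(y)
    p = len(cols)
    gram = [[sum(a * b for a, b in zip(cols[j], cols[k])) / n for k in range(p)]
            for j in range(p)]
    rho = [sum(a * b for a, b in zip(cols[j], target)) / n for j in range(p)]
    beta = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            # Covariance of feature j with the residual that leaves feature j out.
            partial = rho[j] - sum(gram[j][k] * beta[k] for k in range(p) if k != j)
            beta[j] = soft_threshold(partial, lam) / gram[j][j]
    return beta


def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty leaf)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


random.seed(0)
n = 2000
x = [random.choice((-1, 1)) for _ in range(n)]
z = [random.choice((-1, 1)) for _ in range(n)]
xz = [a * b for a, b in zip(x, z)]        # +1 when x == z, -1 otherwise (XOR up to sign)
y = [s + random.gauss(0, 0.5) for s in xz]

lam = 0.2  # hypothetical penalty; comparable to correlations because data are standardised
coef_main = lasso([x, z], y, lam)         # lasso sees only X and Z
coef_with_xz = lasso([x, z, xz], y, lam)  # lasso is also handed the interaction

# Depth-2 tree on binary features: split on X, then on Z inside each branch.
total = sse(y)
after_x = sum(sse([yi for xi, yi in zip(x, y) if xi == s]) for s in (-1, 1))
cells = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
after_x_then_z = sum(
    sse([yi for xi, zi, yi in zip(x, z, y) if (xi, zi) == c]) for c in cells
)
gain_root = 1 - after_x / total           # share of variance removed by the X split alone
gain_depth2 = 1 - after_x_then_z / total  # share removed after X and then Z

print("lasso on X, Z:     ", coef_main)
print("lasso on X, Z, XZ: ", coef_with_xz)
print("tree, X split only:", round(gain_root, 3))
print("tree, X then Z:    ", round(gain_depth2, 3))
```

Under these assumptions lasso should zero out both $X$ and $Z$ when it sees only the main effects, and keep $XZ$ once it is supplied, while the two-level split on $X$ and then $Z$ should remove most of the variance in $Y$. Note that in the pure exclusive-or case the first split gains almost nothing on its own; the gain only shows up at the second level.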
Suppose $X$ and $Z$ are continuous and strongly correlated with each other, and that $X-Z$ is a strong predictor of $Y$ while $X$ and $Z$ separately are at most weakly related to $Y$ (e.g. $X$ is income and $Z$ is expenditure, and it is positive or negative net income that matters). Lasso will pick this up, but it will be hard for a tree to see it because neither $Z$ nor $X$ is a good split. Lasso will rate $X$ and $Z$ as more important than the tree does.
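The second scenario can be sketched the same way, again with the standard library only. The data-generating choices are assumptions for illustration: a shared component `common`, a small `gap` equal to $X-Z$, the noise levels, the sample size, and the small penalty `lam=0.02`. The helpers `standardise`, `soft_threshold`, `lasso` and `best_split_gain` are hypothetical names; `lasso` is a hand-rolled coordinate-descent solver, and `best_split_gain` measures only how much a single threshold split on one feature can reduce the variance of $Y$, which is a stand-in for "is this a good split?" and not a full tree.

```python
import random


def standardise(values):
    """Centre to mean 0 and scale to (population) standard deviation 1."""
    n = len(values)
    mean = sum(values) / n
    sd = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    return [(v - mean) / sd for v in values]


def soft_threshold(value, lam):
    """Shrink value towards zero by lam; exactly 0.0 inside [-lam, lam]."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso(columns, y, lam, sweeps=2000):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam*||b||_1 on standardised data."""
    n = len(y)
    cols = [standardise(c) for c in columns]
    target = standardise(y)
    p = len(cols)
    gram = [[sum(a * b for a, b in zip(cols[j], cols[k])) / n for k in range(p)]
            for j in range(p)]
    rho = [sum(a * b for a, b in zip(cols[j], target)) / n for j in range(p)]
    beta = [0.0] * p
    for _ in range(sweeps):  # many sweeps: convergence is slow for correlated features
        for j in range(p):
            partial = rho[j] - sum(gram[j][k] * beta[k] for k in range(p) if k != j)
            beta[j] = soft_threshold(partial, lam) / gram[j][j]
    return beta


def best_split_gain(feature, y):
    """Largest share of the variance of y removed by one threshold split on feature."""
    n = len(y)
    order = sorted(range(n), key=lambda i: feature[i])
    total = sum(y)
    sst = sum(v * v for v in y) - total * total / n
    best = 0.0
    left_sum = 0.0
    for count, i in enumerate(order[:-1], start=1):
        left_sum += y[i]
        right_sum = total - left_sum
        # Between-group sum of squares for "first `count` points vs the rest".
        between = left_sum ** 2 / count + right_sum ** 2 / (n - count) - total ** 2 / n
        best = max(best, between)
    return best / sst


random.seed(0)
n = 2000
common = [random.gauss(0, 1) for _ in range(n)]   # shared part of X and Z
gap = [random.gauss(0, 0.3) for _ in range(n)]    # X - Z, the actual signal
x = [c + g / 2 for c, g in zip(common, gap)]
z = [c - g / 2 for c, g in zip(common, gap)]
y = [g + random.gauss(0, 0.1) for g in gap]

coef = lasso([x, z], y, lam=0.02)                 # hypothetical small penalty
gain_x = best_split_gain(x, y)
gain_z = best_split_gain(z, y)
gain_diff = best_split_gain([a - b for a, b in zip(x, z)], y)

print("lasso coefficients on X, Z:", [round(c, 2) for c in coef])
print("best single split on X:    ", round(gain_x, 3))
print("best single split on Z:    ", round(gain_z, 3))
print("best single split on X - Z:", round(gain_diff, 3))
```

In this setup lasso should return large coefficients of opposite sign on $X$ and $Z$, whereas the best single split on either variable removes only a small share of the variance compared with a split on $X-Z$ itself. The penalty has to be small for this to work: a penalty larger than the weak marginal correlations of $X$ and $Z$ with $Y$ would set both coefficients to zero.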
Also, I think the asymptotic theory suggests that lasso will be better at identifying the important variables when they are mixed in with a very large set of variables that have no relationship to the outcome.
