Shapley value regression / driver analysis with binary dependent variable

Question

I've done some driver importance analyses with the relaimpo package in R. However, the "normal" Shapley value regressions/driver analyses/Kruskal analyses (whatever you want to name them) require a metric dependent variable, because it's an approach for linear regressions.

I have a new dataset, where I have a dependent variable with two values (0/1) and want to assess the relative importance of 10 metric independent variables.

Is anyone aware of an approach for running such a driver analysis with a binary dependent variable, or of a different way to assess the relative importance of the predictors?

Thanks.

score 1 · Answer 1 · answered Oct 06 '16 at 17:24

You can fit a logistic regression or a random forest classifier and analyze which variables are important. In R, the importance() function gives you the relative importance of the variables in .

yosemite_k

The concept of importance in Shapley regression is very different to that in a Random Forest (a Random Forest will find fewer variables as being more important, all else being equal). And, the ``importance`` function you refer to is not shipped in ``base`` R. – Tim Mar 01 '17 at 03:13
can you explain more, or add some supporting reference? – yosemite_k Mar 02 '17 at 13:53
Sure. The references are in the answer below. Feel free to up-vote after you have read the reference. – Tim Mar 06 '17 at 03:56

score 1 · Answer 2 · answered Mar 01 '17 at 03:11

Relative Importance Analysis gives essentially the same results as Shapley (but not as Kruskal). A variant of Relative Importance Analysis has been developed for binary dependent variables.

However, binary variables are arguably numeric, and I'd be shocked if you got a meaningfully different result from using a standard Shapley regression with your data.
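
To make the idea of running a standard Shapley regression on a 0/1 outcome concrete, here is a minimal sketch. It treats the binary variable as numeric, fits ordinary least squares on every subset of predictors, and averages each predictor's marginal gain in R² with the Shapley weights. Everything in it is an assumption for illustration: the data are simulated, the helper names `r_squared` and `shapley_r2` are hypothetical, and Python is used only so the example runs with the standard library. In R, `calc.relimp(fit, type = "lmg")` from relaimpo on an `lm` fit of the 0/1 outcome should give the same kind of decomposition.

```python
import random
from itertools import combinations
from math import factorial


def r_squared(y, X, cols):
    """R^2 of an OLS fit of y on an intercept plus the columns in `cols`."""
    if not cols:
        return 0.0  # intercept-only model explains nothing
    n, k = len(y), len(cols)
    y_mean = sum(y) / n
    yc = [v - y_mean for v in y]
    sst = sum(v * v for v in yc)
    # Centre the selected predictors so the intercept drops out.
    means = [sum(row[c] for row in X) / n for c in cols]
    Z = [[row[c] - m for c, m in zip(cols, means)] for row in X]
    # Augmented normal equations [Z'Z | Z'y].
    A = [[sum(Z[i][a] * Z[i][b] for i in range(n)) for b in range(k)]
         + [sum(Z[i][a] * yc[i] for i in range(n))] for a in range(k)]
    # Gaussian elimination with partial pivoting
    # (assumes no perfectly collinear predictors).
    for p in range(k):
        piv = max(range(p, k), key=lambda r: abs(A[r][p]))
        A[p], A[piv] = A[piv], A[p]
        for r in range(p + 1, k):
            f = A[r][p] / A[p][p]
            for c in range(p, k + 1):
                A[r][c] -= f * A[p][c]
    beta = [0.0] * k
    for p in reversed(range(k)):
        s = A[p][k] - sum(A[p][c] * beta[c] for c in range(p + 1, k))
        beta[p] = s / A[p][p]
    sse = sum((yc[i] - sum(beta[j] * Z[i][j] for j in range(k))) ** 2
              for i in range(n))
    return 1.0 - sse / sst


def shapley_r2(y, X):
    """Shapley share of R^2 for each predictor column of X."""
    p = len(X[0])
    cache = {}

    def r2(cols):
        key = tuple(sorted(cols))
        if key not in cache:
            cache[key] = r_squared(y, X, key)
        return cache[key]

    shares = []
    for j in range(p):
        others = [c for c in range(p) if c != j]
        total = 0.0
        for size in range(p):
            # Shapley weight for a coalition of this size.
            w = factorial(size) * factorial(p - size - 1) / factorial(p)
            for S in combinations(others, size):
                total += w * (r2(S + (j,)) - r2(S))
        shares.append(total)
    return shares


# Simulated data (an assumption, not the asker's dataset): three metric
# predictors; the 0/1 outcome depends strongly on the first, weakly on the
# second, and not at all on the third.
random.seed(0)
X = [[random.gauss(0, 1) for _ in range(3)] for _ in range(200)]
y = [1.0 if 2.0 * r[0] + 0.5 * r[1] + random.gauss(0, 1) > 0 else 0.0
     for r in X]

shares = shapley_r2(y, X)
full_r2 = r_squared(y, X, (0, 1, 2))
for j, s in enumerate(shares):
    print(f"x{j + 1}: share of R^2 = {s:.4f} ({100 * s / full_r2:.1f}%)")
```

The shares add up to the R² of the full model, which is what makes them readable as relative importances. The sketch enumerates all 2^p subsets, so with 10 predictors it fits 1,024 small regressions, which is still cheap.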
