
I am working in R, creating a GBM model using H2O, and trying to use LIME to look at some local explanations to get a feel for what the model is doing. It's a binary classifier, and I'm specifying 8 for n_features in the LIME package. However, I keep running into situations where all or most of the 8 features are shown as contradicting the highest-probability class. The funny thing is that the predicted probability of the class is in the 90s.

How would one interpret this? Is there a problem in the LIME package implementation?

Here are a couple of examples:

[Screenshots of the LIME explanation plots omitted: for each case, the top 8 features are shown, most marked as contradicting the predicted class.]
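For reference, the explanations above were generated along these lines (a minimal sketch of the workflow described in the question; the data frames `train_df` and `test_df`, the `"target"` column, and the GBM hyperparameters are placeholders, not the actual ones used):

```r
library(h2o)
library(lime)

h2o.init()

# Placeholder data: any data frame with a TRUE/FALSE target column.
train_h2o <- as.h2o(train_df)

# H2O GBM binary classifier (hyperparameters illustrative only).
gbm_fit <- h2o.gbm(
  x = setdiff(names(train_df), "target"),
  y = "target",
  training_frame = train_h2o,
  ntrees = 100
)

# Build a LIME explainer on the training data, then explain a few
# cases using the 8 most influential features, as described above.
explainer   <- lime(train_df, gbm_fit)
explanation <- explain(
  test_df[1:10, ],
  explainer,
  n_labels   = 1,   # explain only the predicted class
  n_features = 8
)
plot_features(explanation)
```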

Dave Kincaid
  • If you reverse those inequalities, which are standardized to be of the $\leq$ or $<$ form … – jbowman Dec 18 '17 at 23:37
  • I'm not sure I'm following you. If I look at Case 8, for example, the predicted class is FALSE with probability 0.945. I see that for this case the feature "visitTotalCost" is <= 24.6, and this fact contradicts a prediction of FALSE and would instead support a prediction of TRUE. So what is causing the probability of FALSE to be so high? – Dave Kincaid Dec 19 '17 at 00:05
  • You introduced a new tag [lime]; can you please add a tag wiki? – kjetil b halvorsen Dec 19 '17 at 10:38
  • Can you please provide additional information and potentially some reproducible code? Please note that LIME makes the rather significant assumption that a linear model can stand in, *in terms of explanatory value*, on a local scale. If your model is highly non-linear, (almost) all bets are off. To that extent, if you fit a simple GLM, are the GLM estimates and the GLM-LIME estimates reasonable? (They should actually coincide.) Finally, you do not mention your `TRUE/FALSE` training ratio. That might be an issue too. – usεr11852 Jan 04 '18 at 20:01
  • Is this why all the examples, tutorials, and talks I see about LIME use only very simple data/models and really only discuss the examples from the paper? Is it not really usable in the real world? – Dave Kincaid Jan 05 '18 at 02:52
  • @Dave I do not think that is the case, but, that being said, as it is a very new methodology and people have not yet become fully accustomed to it, expositions of it tend to be somewhat basic. Please provide some of the information I requested; as it stands, your question unfortunately does not include all the information needed. – usεr11852 Jan 05 '18 at 12:08
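The GLM sanity check suggested in the comments could look roughly like this (a sketch; `train_df`, `test_df`, `train_h2o`, and the `"target"` column are placeholder names, not objects from the question). For a model that really is (near-)linear, LIME's local feature weights should broadly agree in sign with the global GLM coefficients:

```r
library(h2o)
library(lime)

# Fit a simple H2O GLM on the same data as the GBM (placeholder names).
glm_fit <- h2o.glm(
  x = setdiff(names(train_df), "target"),
  y = "target",
  training_frame = train_h2o,
  family = "binomial"
)

# Global coefficients of the GLM, for comparison.
h2o.coef(glm_fit)

# LIME explanations for the same GLM; the signs of feature_weight
# should roughly match the signs of the coefficients above.
glm_explainer   <- lime(train_df, glm_fit)
glm_explanation <- explain(test_df[1:5, ], glm_explainer,
                           n_labels = 1, n_features = 8)
glm_explanation[, c("case", "feature", "feature_weight")]
```

If the two disagree even for a plain GLM, the problem is with the explanation setup rather than with the GBM's non-linearity.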

1 Answer


Drawing on jbowman's comment: if you look at the values of the predictors in the example cases, do they actually satisfy the conditions on the left? For example, in Case 8, if visitTotalCount is actually > 24.6, then it would be having a positive effect on the model's prediction.
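One way to check this directly (a sketch; `explanation` and `test_df` are assumed to be the lime explanation object and the raw data frame from the original workflow, and `visitTotalCount` stands in for whatever the real column is called) is to compare each case's raw predictor values against the discretized rules LIME reports:

```r
# lime stores both the discretized rule (feature_desc, e.g.
# "visitTotalCount <= 24.6") and the raw value (feature_value)
# for each explained feature, so the two can be cross-checked.
subset(explanation, case == "8",
       select = c("feature", "feature_value", "feature_desc", "feature_weight"))

# The raw row itself, for an independent check of the value:
test_df[8, "visitTotalCount"]
```

If `feature_value` does not satisfy the rule in `feature_desc`, the plot's support/contradict labels are being read against the wrong side of the inequality.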

SlyFox
  • No, the value is <= 24.6; that is how the label is created, so it contradicts the prediction of FALSE. And the top 8 features, as shown in the graph, all contradict the prediction. That is what has me confused. – Dave Kincaid Jan 04 '18 at 15:23