
I have found that there are different solutions, like the ones below, to explain ML predictions:

a) LIME

b) SHAP

Despite using all these approaches, I see that each of them works for certain data points and not for others. For example, let's assume we have a dataset with 6 records. LIME works / its explanation makes sense for records 1, 2, 3 and 4, whereas the LIME explanation doesn't make sense for records 5 and 6.

On the other hand, SHAP works well for records 3, 4, 5 and 6 but doesn't explain records 1 and 2 well.
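
For concreteness, here is a minimal sketch of the kind of per-record explanation I am comparing. The dataset, model, and record index are illustrative only (synthetic data, not my actual pipeline):

```python
# Illustrative only: synthetic data and a generic classifier stand in for my real setup.
import numpy as np
import pandas as pd
import shap
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X_arr, y = make_classification(n_samples=300, n_features=6, random_state=0)
feature_names = [f"f{i}" for i in range(6)]
X = pd.DataFrame(X_arr, columns=feature_names)
model = RandomForestClassifier(random_state=0).fit(X, y)

record = X.iloc[0]  # one row whose individual prediction I want to explain

# LIME: fit a local surrogate around this record and report its weights
lime_explainer = LimeTabularExplainer(X.values, feature_names=feature_names, mode="classification")
lime_exp = lime_explainer.explain_instance(record.values, model.predict_proba,
                                           num_features=len(feature_names))
print("LIME:", lime_exp.as_list())  # [(feature condition, local weight), ...]

# SHAP: Shapley-value contributions to the positive-class probability for the same record
predict_pos = lambda data: model.predict_proba(data)[:, 1]
shap_explainer = shap.KernelExplainer(predict_pos, shap.sample(X, 50))
shap_vals = shap_explainer.shap_values(record.values.reshape(1, -1))
print("SHAP:", dict(zip(feature_names, np.ravel(shap_vals))))
```

Both calls return one weight per feature for that single record, and it is these per-record weights that look sensible for some rows and not for others.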

So, my questions are below:

Are machine learning models not 100% explainable? How do you deal with such scenarios when you wish to explain your predictions to business users but find that the local explanations make sense for a few records and not for others? Do you compromise on the explanations and go live even with some inconsistency in the explanations?

Is this expected behavior? Simply put, what do you answer when your business asks why certain explanations are inaccurate (while others are correct)? How do we rely on such a solution?

The Great
    You say "solutions like below" and mention specific records but there is no data attached to your question. Is that intentional? – g g Feb 17 '22 at 10:11
  • No, (nontrivial) machine learning models are not 100% explainable, not even closely. – frank Feb 17 '22 at 10:56
  • @gg - By solutions, I meant LIME and SHAP. My question is about the inconsistency between different explainability solutions and within the same solution. Meaning, I would like to have one stable approach which can work well for all rows – The Great Feb 17 '22 at 11:59
  • @frank - When your business stakeholders ask why certain explanations are not accurate, what do we say? Wouldn't it reflect badly on the data scientist? – The Great Feb 17 '22 at 12:00
  • You usually cannot create a model that perfectly describes your data. Data scientists always try to strike the balance between the accuracy of the results and expenditure. Also, what do you think explainability of AI actually means? Do you explain noise? – frank Feb 17 '22 at 12:08
  • Not to explain noise. What I mean is, the algorithm has identified the outcome classes correctly as per the actual labels. But when we look at the explanations for these records, they are off. – The Great Feb 17 '22 at 12:10
  • Since I am looking for local explanations, I tried both SHAP and LIME. Based on their results, if I build up a story to explain a few selected data points, they work well... but the same logic doesn't work for all records, making the business lose confidence in such solutions – The Great Feb 17 '22 at 12:13
  • Maybe your model gives the right prediction for the wrong reasons? I.e. the explanations may actually capture "why" the model predicts what it predicts; the humans looking at the explanation just don't like the explanation. Perhaps the model is exploiting some correlation that is not intuitive to humans (whether or not that correlation would truly hold for new data)? – Björn Feb 17 '22 at 12:16
  • @Björn - It is not about not liking the predictions. It is about the explanations not making sense. For example: when it rains heavily, it makes sense that the cricket match didn't happen (due to bad weather)... this makes sense. I also see another explanation which says the match didn't happen because the weather was normal. Just a made-up example. This sort of explanation doesn't make sense. Again, this behavior is seen not for all records, but definitely for 30% of the records – The Great Feb 17 '22 at 12:22
  • It could be that the explanation does not fully capture "what the model does" or it maybe highlights that the model "identified" a pattern that does not make sense to humans. If it's the latter, it's important to understand why. All too often, it's not a case of models discovering some amazing new insights ("Wow, who knew? Good weather causes sports events to be cancelled!"), but rather may point to some issues with the data selection, the data itself and/or the data processing. E.g. weather was only recorded when events actually took place, while missing values were imputed as "good weather". – Björn Feb 17 '22 at 20:15
  • Hi @TheGreat. Question, is there a quantitative way to see the confidence of LIME or SHAP? Ideally, you could run both and pick the more confident answer. But this only works in an automated measure if you have a measure of confidence / some metric by which to pick the one to show. – shf8888 Feb 19 '22 at 15:06
  • @shf8888 - Both LIME and SHAP give the relative importance of each feature towards the output. There is no confidence level for each explanation. – The Great Feb 20 '22 at 06:24
  • @theGreat, when you get explanations that don't make sense, is the relative / absolute importance of the most important feature relatively low? e.g., is the LIME/SHAP relative importance of the most important feature negatively correlated with quality of the explanation? I'm wondering if you could pick LIME vs. SHAP by selecting whichever one had the most relative importance for the first most important feature. – shf8888 Feb 22 '22 at 15:47
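
A rough sketch of the heuristic shf8888 describes, reusing the illustrative model and explainers from the sketch in the question. Treating the top feature's share of the total absolute importance as a per-record "confidence" score is an assumption of this sketch, not a metric either library provides:

```python
# Heuristic from the comments: per record, score each explainer by how much weight its
# single most important feature carries, and report the more "concentrated" one.
import numpy as np

def top_feature_share(weights):
    """Fraction of total absolute importance held by the single most important feature."""
    w = np.abs(np.asarray(weights, dtype=float))
    return float(w.max() / w.sum()) if w.sum() > 0 else 0.0

def pick_explainer(record):
    # LIME weights for this record (coefficients of the local surrogate)
    lime_exp = lime_explainer.explain_instance(record.values, model.predict_proba,
                                               num_features=len(feature_names))
    lime_weights = [weight for _, weight in lime_exp.as_list()]

    # SHAP contributions for this record (positive-class probability)
    shap_weights = np.ravel(shap_explainer.shap_values(record.values.reshape(1, -1)))

    scores = {"LIME": top_feature_share(lime_weights), "SHAP": top_feature_share(shap_weights)}
    return max(scores, key=scores.get), scores

chosen, scores = pick_explainer(X.iloc[0])
print(chosen, scores)  # which explainer concentrates more weight on its top feature
```

Whether this concentration score actually tracks how sensible an explanation looks would still have to be checked by hand against records like 1-6 above.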

0 Answers