I am interested in finding out how confident my model (say, logistic regression) is when predicting the label of a new data point. For example, if it is not confident, I would rather abstain from making a prediction.

Logistic regression outputs probabilities, which suggests a notion of confidence in the prediction. But in fact it is not one. $P(y|x)$ being 0.3 doesn't tell me whether the model is confident in its prediction or not. All we can say is that it believes 100% that the probability of class A is 0.3 and that of the other class is 0.7.

Can we use the confidence intervals of LR as some sort of confidence in the prediction? E.g., the larger the difference between the upper and lower bounds for a particular data point, the less confident the model is, and vice versa?

If not, how can we build confidence in our prediction? Can anybody guide me to some paper or field of study?

kjetil b halvorsen
neo
  • You could use bootstrap samples to obtain different fitted values and calculate a CI based on these new outcomes. – Pedrinho Feb 21 '19 at 07:41
  • Yes, you can use confidence interval calculations for logistic regression; for more complicated models you have to use bootstrap samples. See https://stats.stackexchange.com/a/354660/ – seanv507 Feb 21 '19 at 09:50
  • Seems to be answered here: https://stats.stackexchange.com/questions/29044/plotting-confidence-intervals-for-the-predicted-probabilities-from-a-logistic-re – kjetil b halvorsen Mar 19 '20 at 03:36
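
The bootstrap idea from the comments can be sketched roughly as follows. This is a minimal, standard-library-only illustration, not a reference implementation: the helper names `fit_logistic` and `bootstrap_interval`, the toy data, the small ridge penalty, and the fixed number of gradient steps are all assumptions made for the example. It refits a one-feature logistic regression on resampled data and takes percentiles of the predicted probability at a new point.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(xs, ys, ridge=0.01, n_steps=300):
    """Approximate one-feature logistic fit by gradient ascent.

    The small ridge penalty is an assumption added for stability (it keeps
    the fit finite if a resample happens to be perfectly separable); it is
    not part of plain logistic regression.
    """
    # Step size 1/L, where L bounds the curvature of the penalised objective.
    step = 1.0 / (0.25 * sum(1.0 + x * x for x in xs) + ridge)
    b0 = b1 = 0.0
    for _ in range(n_steps):
        g0, g1 = -ridge * b0, -ridge * b1
        for x, y in zip(xs, ys):
            r = y - sigmoid(b0 + b1 * x)
            g0 += r
            g1 += r * x
        b0 += step * g0
        b1 += step * g1
    return b0, b1


def bootstrap_interval(xs, ys, x_new, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap interval for P(y = 1 | x_new)."""
    rng = random.Random(seed)
    n = len(xs)
    preds = []
    for _ in range(n_boot):
        # Resample (x, y) pairs with replacement and refit.
        idx = [rng.randrange(n) for _ in range(n)]
        b0, b1 = fit_logistic([xs[i] for i in idx], [ys[i] for i in idx])
        preds.append(sigmoid(b0 + b1 * x_new))
    preds.sort()
    lo = preds[int((alpha / 2) * (n_boot - 1))]
    hi = preds[int(math.ceil((1 - alpha / 2) * (n_boot - 1)))]
    return lo, hi


# Made-up toy data: one feature, binary labels with some overlap.
xs = [-2.0, -1.5, -1.2, -1.0, -0.8, -0.5, -0.3, 0.0, 0.2, 0.4,
      0.6, 0.9, 1.1, 1.4, 1.7, 2.0]
ys = [0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 1]

b0, b1 = fit_logistic(xs, ys)
x_new = 0.1
p_hat = sigmoid(b0 + b1 * x_new)
lo, hi = bootstrap_interval(xs, ys, x_new)
print(f"p_hat = {p_hat:.2f}, 95% bootstrap interval = ({lo:.2f}, {hi:.2f})")
```

A wide interval here says that the estimated probability itself is poorly determined by the data, which is a separate question from how far that probability is from 0.5.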

1 Answer

Your inference about $y$ is the Bernoulli distribution

$$
\mathrm{prob}(y = y^\prime | x = x^\prime, \mathcal{D}, \mathcal{I}) = \begin{cases} 0.3 & y^\prime = 1 \\ 0.7 & y^\prime = 0 \end{cases}
$$

The probability $0.3$ is itself the measure of your confidence in $y = 1$. So you should not confidently predict that $y = 1$, nor should you even confidently predict that $y = 0$.
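
If the predicted probability is taken as the confidence measure, one possible way to abstain is to predict only when the larger of the two class probabilities reaches a chosen threshold. The sketch below assumes this rule; the function name `predict_or_abstain` and the default threshold of 0.8 are hypothetical choices for illustration, not something prescribed by logistic regression.

```python
def predict_or_abstain(p_positive, threshold=0.8):
    """Return 1 or 0 if the more likely class has probability >= threshold,
    otherwise return None to signal abstention.

    p_positive: predicted P(y = 1 | x), a number in [0, 1].
    threshold:  hypothetical minimum confidence, between 0.5 and 1.
    """
    if not 0.0 <= p_positive <= 1.0:
        raise ValueError("p_positive must lie in [0, 1]")
    if not 0.5 <= threshold <= 1.0:
        raise ValueError("threshold must lie in [0.5, 1]")
    confidence = max(p_positive, 1.0 - p_positive)
    if confidence < threshold:
        return None  # neither class is probable enough: abstain
    return 1 if p_positive >= 0.5 else 0


# With P(y = 1 | x) = 0.3 the more likely class has probability 0.7,
# which is below the assumed 0.8 threshold, so the rule abstains.
print(predict_or_abstain(0.3))
```

The threshold would normally be chosen from the costs of a wrong prediction versus an abstention, which depend on the application.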

CarbonFlambe