
I'm predicting probabilities $\mathbb{P}(Y=1)$ using a probability forest (ranger in R). I want to evaluate my predictions $\hat p_i$ on a test dataset by calculating the average binomial deviance (i.e. $-2$ times the average log-likelihood). I believe the formula is: \begin{equation} \text{mean deviance} = \frac{1}{n}\sum_{i\in \text{test set}} -2\big[Y_i\ln \hat p_i + (1-Y_i)\ln(1- \hat p_i)\big], \end{equation} where $n$ is the number of test observations. How do I deal with the fact that some of the forest predictions are exactly 0 or 1? For these observations the deviance is not defined, because of the logarithm. Should I just omit them? Or should I set these values to, say, 0.00001 and 0.99999 respectively?
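For concreteness, here is a minimal R sketch of the computation I have in mind (the vectors are made-up illustrations, not my actual data); it shows how predictions of exactly 0 or 1 make the result undefined:

```r
y_test <- c(0, 1, 1, 0, 1)            # observed outcomes (illustrative)
p_hat  <- c(0.2, 0.9, 1.0, 0.0, 0.7)  # forest predictions, some exactly 0 or 1

# per-observation binomial deviance, then the mean over the test set
dev_i <- -2 * (y_test * log(p_hat) + (1 - y_test) * log(1 - p_hat))
mean(dev_i)
# NaN: for the 0/1 predictions R evaluates 0 * log(0) = 0 * (-Inf) = NaN
```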

user116514
  • My aim is to estimate probabilities, not classify observations. Specificity and sensitivity are used in the latter case. – user116514 Oct 14 '18 at 08:02
  • P(Y=1|X=x). In my case, Y is a binary indicator that patients are part of a pharmaceutical cost group (1=yes, 0=no). X is a vector of predictors. – user116514 Oct 14 '18 at 18:22
  • You misunderstand what I'm doing. My estimator is a random probability forest. My evaluation metric on the test set is the binomial deviance. – user116514 Oct 16 '18 at 16:15

2 Answers


I recommend against fudging these prediction values. The appropriate outcome here is that if the model predicts a thing with probability 1, and that thing doesn't happen, then its deviance is infinite. Similarly, if the model predicts a thing with probability 0, and that thing happens, then its deviance is infinite. That is the price you pay for making such strong predictions on an outcome and getting them wrong.

To achieve this outcome, you just have to deal with the ambiguous terms of the form $0 \times (-\infty)$. Here you would adopt the conventions that $\ln 0 = -\infty$ and $0 \times (-\infty) = 0$, giving you:

$$\begin{align} Y_i \ln \hat{p}_i &= \begin{cases} 0 & \text{if } Y_i = 0, \\[6pt] \ln \hat{p}_i & \text{if } Y_i = 1, \end{cases} \\[18pt] (1-Y_i) \ln (1-\hat{p}_i) &= \begin{cases} \ln (1-\hat{p}_i) & \text{if } Y_i = 0, \\[6pt] 0 & \text{if } Y_i = 1. \end{cases} \end{align}$$
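In R, for example, one way to implement these conventions is to evaluate only the log term whose indicator equals 1, so the $0 \times (-\infty)$ products never arise (a minimal sketch; the function name is just illustrative):

```r
# Per-observation deviance: take log(p) only when y = 1 and log(1 - p)
# only when y = 0, so Inf appears only for certain-and-wrong predictions.
deviance_i <- function(y, p) {
  -2 * ifelse(y == 1, log(p), log(1 - p))
}

deviance_i(1, 1)    # 0     -- certain and correct
deviance_i(0, 1)    # Inf   -- certain and wrong
deviance_i(1, 0.8)  # 0.446 -- ordinary case
```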

Ben

You can clip the probabilities to guarantee that they will never be exactly 0 or 1. For example, as in the scikit-learn documentation for log loss, choose a small value eps and use max(eps, min(1 - eps, p)), where p is the classifier's predicted probability.
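A minimal R sketch of the same idea (the eps value and the example vectors are arbitrary illustrations):

```r
eps    <- 1e-15
y_test <- c(0, 1, 1, 0)           # observed outcomes (illustrative)
p_hat  <- c(0.2, 0.9, 1.0, 0.0)   # predictions, some exactly 0 or 1

# clip into [eps, 1 - eps] before taking logs
p_clip <- pmax(eps, pmin(1 - eps, p_hat))
mean(-2 * (y_test * log(p_clip) + (1 - y_test) * log(1 - p_clip)))
# finite; here the exact 0/1 predictions were correct, so they contribute ~0
```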

mchl_k