
Binary cross entropy for multi-label classification can be defined by the following loss function:

$$-\frac{1}{N}\sum_{i=1}^N [y_i \log(\hat{y}_i)+(1-y_i) \log(1-\hat{y}_i)]$$
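As a concrete instance of this formula (numbers chosen arbitrarily): for a single sample with labels $y = (1, 0)$ and predictions $\hat{y} = (0.9, 0.2)$, it gives

$$-\frac{1}{2}\left[\log(0.9) + \log(1 - 0.2)\right] \approx 0.164$$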

Why does the Keras binary_crossentropy loss function return different values? What formula is actually behind them? I tried to read the source code, but it is not easy to understand.

Updated

The following code gives approximately the same result as Keras:

import math
import numpy as np
import keras.backend as K

def binary_crossentropy(y_true, y_pred):
    result = []
    for i in range(len(y_pred)):
        # Clip predictions into [epsilon, 1 - epsilon], as Keras does, to avoid log(0)
        y_pred[i] = [max(min(x, 1 - K.epsilon()), K.epsilon()) for x in y_pred[i]]
        # Per-sample loss: mean binary cross entropy over the labels
        result.append(-np.mean([y_true[i][j] * math.log(y_pred[i][j])
                                + (1 - y_true[i][j]) * math.log(1 - y_pred[i][j])
                                for j in range(len(y_pred[i]))]))
    return np.mean(result)  # average the per-sample losses over the batch
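To sanity-check this against Keras itself, something like the following comparison can be used (a minimal sketch, assuming Keras 2.x with the TensorFlow backend; the arrays are arbitrary example data):

import numpy as np
import keras.backend as K
from keras.losses import binary_crossentropy as keras_bce

y_true = np.array([[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
y_pred = np.array([[0.9, 0.2, 0.8], [0.3, 0.6, 0.7]])

# Keras returns one loss value per sample (the mean over the last axis),
# so average over the batch before comparing with the scalar above
per_sample = K.eval(keras_bce(K.variable(y_true), K.variable(y_pred)))
print(per_sample.mean())
print(binary_crossentropy(y_true, y_pred))  # the manual version above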
Dmitry

1 Answer


There is a mistake in your code:

$$-\frac{1}{N}\sum_{i=1}^N [\color{red}{\hat{y}_i} \log(\hat{y}_i)+(1-y_i) \log(1-\hat{y}_i)]$$

It should be

$$-\frac{1}{N}\sum_{i=1}^N [\color{blue}{y_i} \log(\hat{y}_i)+(1-y_i) \log(1-\hat{y}_i)]$$

Your code:

result.append([y_pred[i][j] * math.log(y_pred[i][j]) + (1 - y_true[i][j]) * math.log(1 - y_pred[i][j]) for j in range(len(y_pred[i]))])

should be changed to

result.append([y_true[i][j] * math.log(y_pred[i][j]) + (1 - y_true[i][j]) * math.log(1 - y_pred[i][j]) for j in range(len(y_pred[i]))])

where I have changed your first y_pred to y_true.
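A one-element check makes the difference visible (numbers are arbitrary):

import math

y_true, y_pred = 1.0, 0.9
buggy = -(y_pred * math.log(y_pred) + (1 - y_true) * math.log(1 - y_pred))  # ≈ 0.0948
fixed = -(y_true * math.log(y_pred) + (1 - y_true) * math.log(1 - y_pred))  # ≈ 0.1054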

Edit: Also, from the Keras documentation, the signature is

binary_crossentropy(y_true, y_pred)

rather than

binary_crossentropy(y_pred, y_true)
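The order matters because the loss is not symmetric in its two arguments, and the epsilon clipping is applied to the predictions (the second argument). A quick illustration, under the same Keras 2.x/TensorFlow-backend assumption as above, with hypothetical example arrays:

import numpy as np
import keras.backend as K
from keras.losses import binary_crossentropy

y_true = K.variable(np.array([[1.0, 0.0, 1.0]]))
y_pred = K.variable(np.array([[0.9, 0.2, 0.8]]))

print(K.eval(binary_crossentropy(y_true, y_pred)))  # correct order, ≈ 0.184
print(K.eval(binary_crossentropy(y_pred, y_true)))  # swapped order: the 0/1 labels get
                                                    # clipped and logged, giving a much larger value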
Siong Thye Goh