0

Let's say I have 3 classes, and each sample can belong to any of those classes.

[
 [1 0 0]
 [0 1 0]
 [0 0 1]
 [1 1 0]
 [1 0 1]
 [0 1 1]
 [1 1 1]
]

I set my output layer to Dense(3, activation="sigmoid") and compiled with optimizer="adam", loss="binary_crossentropy". According to the Keras output, I get a loss of 0.05 and an accuracy of 0.98.

I thought I would get only 1 or 0 for predictions if I used sigmoid and binary_crossentropy. However, model.predict(training_features) gave me values between 0 and 1.

Then I thresholded the values at 0.5 as shown below and checked accuracy_score(training_labels, preds). The score dropped to 0.1.

preds[preds>=0.5] = 1
preds[preds<0.5] = 0

I'd appreciate it if someone could give me some guidance on how to approach this problem.

Thanks!

jl303
  • 101

1 Answer

0

Okay, things you need to correct in your approach:

  1. If you have 3 labels/classes, you should one-hot encode your y_train.
  2. You should probably use loss="categorical_crossentropy" in compile when you have more than 2 classes.
  3. Your final activation function should be a softmax and not a sigmoid. You are getting prediction values between 0 and 1, because that's what sigmoid does.

Now, if you take an argmax on your prediction output, you can see the class with the highest confidence score.
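
The argmax step could be sketched as follows. This is a minimal illustration assuming NumPy; the array `probs` and its values are made up to stand in for softmax output and are not taken from the question. Note that argmax returns exactly one class index per sample, so it only fits the case where each sample has a single class.

```python
import numpy as np

# Hypothetical softmax outputs for 3 samples over 3 classes
# (made-up values; each row sums to 1).
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.3, 0.5],
])

# Index of the highest-scoring class for each sample (one per row).
predicted_classes = np.argmax(probs, axis=1)
print(predicted_classes)
```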

Anakin
  • 129
  • 4
  • Thanks. However, all the posts I have read so far suggest that I should use sigmoid for the final activation and binary_crossentropy for the loss when working with multiclass, multilabel problems. Is that not true? https://www.depends-on-the-definition.com/guide-to-multi-label-classification-with-neural-networks/ https://stats.stackexchange.com/questions/207794/what-loss-function-for-multi-class-multi-label-classification-tasks-in-neural-n https://github.com/keras-team/keras/issues/10371 – jl303 Apr 21 '19 at 02:59
  • See this https://stackoverflow.com/questions/42081257/keras-binary-crossentropy-vs-categorical-crossentropy-performance – Anakin Apr 21 '19 at 15:23
  • 2
    By the way, you have a multiclass classification problem, not a multilabel one – Anakin Apr 21 '19 at 15:28
  • Also have a look at this post https://towardsdatascience.com/deep-learning-which-loss-and-activation-functions-should-i-use-ac02f1c56aa8 – Anakin Apr 21 '19 at 15:34
  • 1
    Thanks @Anakin. Isn't it multiclass, multilabel when there are more than 2 classes and a sample can belong to more than one class at the same time? Multiclass: "fruit can be either an apple or a pear but not both at the same time." Multilabel: "A text might be about any of religion, politics, finance or education at the same time or none of these." https://scikit-learn.org/stable/modules/multiclass.html – jl303 Apr 23 '19 at 00:56
  • Sorry, I misunderstood then. My bad. – Anakin Apr 23 '19 at 05:10
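
For the multilabel setup described in the comments (sigmoid outputs thresholded at 0.5), it may help to see how two different notions of accuracy behave on the same predictions. The sketch below assumes NumPy only; `y_true` reuses the label matrix from the question, while `probs` holds made-up sigmoid outputs. As far as I know, Keras resolves "accuracy" to a per-element binary accuracy when the loss is binary_crossentropy, whereas scikit-learn's accuracy_score computes subset (exact-match) accuracy on multilabel indicator arrays. That difference can account for some gap between the two numbers, though probably not one as large as 0.98 versus 0.1, so checking that the predictions and labels are aligned row by row is also worthwhile.

```python
import numpy as np

# Label matrix from the question: 7 samples, 3 labels each.
y_true = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 1],
    [1, 1, 1],
])

# Hypothetical sigmoid outputs (made-up values for illustration).
probs = np.array([
    [0.9, 0.1, 0.2],
    [0.2, 0.8, 0.1],
    [0.1, 0.3, 0.7],
    [0.8, 0.4, 0.1],  # misses the second label
    [0.9, 0.2, 0.6],
    [0.6, 0.7, 0.9],  # predicts an extra first label
    [0.9, 0.8, 0.3],  # misses the third label
])

# Threshold into a new array instead of overwriting the probabilities.
preds = (probs >= 0.5).astype(int)

# Per-element accuracy: fraction of individual label decisions that match.
per_label_acc = (preds == y_true).mean()

# Subset accuracy: fraction of samples whose whole label row matches.
subset_acc = (preds == y_true).all(axis=1).mean()

print(f"per-label accuracy: {per_label_acc:.3f}")
print(f"subset accuracy:    {subset_acc:.3f}")
```

Here 18 of the 21 individual label decisions are correct, but only 4 of the 7 rows match exactly, so the subset figure is noticeably lower than the per-label one.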