
I am working on a linear classifier whose expected output is 1 for membership in class A and 0 for membership in class B.

On some occasions the output is

  • nearly 0 (0.000198752053929624), or
  • nearly 1 (0.999740100963010).

I've decided to round these numbers, and the resulting accuracy is 100%.

My question is: is this an acceptable procedure, or is there some underlying problem that produces these outputs instead of clean 0s and 1s?

This behaviour occurs for learning rates smaller than 0.1 with gradient descent.

Sycorax
    Closely related: https://stats.stackexchange.com/questions/312780/why-is-accuracy-not-the-best-measure-for-assessing-classification-models – Sycorax Jan 03 '21 at 18:30
    @Sycorax’s link is quite important. – Dave Jan 04 '21 at 00:45

1 Answer


When you expect $0$ or $1$ as the result, there are two common approaches: (1) rounding ($> 0.5$ is class 1, $\leq 0.5$ is class 0), or (2) looking at the receiver operating characteristic (ROC) curve or the precision-recall (PR) curve to find a threshold. The second approach is generally better because it lets you choose the threshold that works best for the dataset at hand.
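
As a minimal sketch of the second approach (assuming scikit-learn, an already fitted model `clf` with a `predict_proba` method, and placeholder validation data `X_val`, `y_val`), one common recipe is to take the threshold on the ROC curve that maximizes Youden's J statistic (tpr - fpr):

```python
import numpy as np
from sklearn.metrics import roc_curve

probs = clf.predict_proba(X_val)[:, 1]          # predicted probability of class 1
fpr, tpr, thresholds = roc_curve(y_val, probs)  # candidate thresholds along the ROC curve
best = thresholds[np.argmax(tpr - fpr)]         # threshold with the largest Youden's J
y_pred = (probs >= best).astype(int)            # final 0/1 labels
```

With a PR curve the mechanics are the same; you would just pick the threshold that maximizes whichever precision/recall trade-off (e.g. F1) matters for your problem.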

If you want the output probabilities to indicate confidence, you should look at metrics such as the expected calibration error, or at calibration curves. Ideally, low probabilities would then mean high confidence in class 0, and high probabilities high confidence in class 1.
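
A rough sketch of a calibration (reliability) curve, again assuming scikit-learn and the same placeholder `probs` and `y_val` as above: the predicted probabilities are binned, and each bin's mean prediction is compared to the observed fraction of positives.

```python
import matplotlib.pyplot as plt
from sklearn.calibration import calibration_curve

frac_pos, mean_pred = calibration_curve(y_val, probs, n_bins=10)  # observed vs. predicted per bin

plt.plot(mean_pred, frac_pos, marker="o", label="model")
plt.plot([0, 1], [0, 1], linestyle="--", label="perfectly calibrated")
plt.xlabel("Mean predicted probability")
plt.ylabel("Fraction of positives")
plt.legend()
plt.show()
```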

As was noted in the comments, accuracy is not the best metric for assessing how good a classifier is. I would also compute precision, ideally on a held-out set, to check that the results hold up, since your classifier may be overfitting.
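
A quick way to check this (a sketch assuming scikit-learn and placeholder train/validation splits `X_train`, `y_train`, `X_val`, `y_val`) is to compare precision on the training data and on held-out data; a large gap would point to overfitting:

```python
from sklearn.metrics import precision_score

print("train precision:", precision_score(y_train, clf.predict(X_train)))
print("valid precision:", precision_score(y_val, clf.predict(X_val)))
```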

You should also keep in mind that the output of a neural network (or any model with a sigmoid activation) is a probability between 0 and 1, not a binary 0/1 value.
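
To make that concrete, here is a small illustrative sketch (NumPy only, with an arbitrary example logit):

```python
import numpy as np

def sigmoid(z):
    """Map a real-valued score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

z = 8.5                # example pre-activation value
p = sigmoid(z)         # ~0.9998: a probability, not a label
label = int(p > 0.5)   # the 0/1 label only appears after thresholding
```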

displayname
    Better yet, look at log loss or Brier score! Both of those are strictly proper scoring rules. – Dave Jan 04 '21 at 00:44