Getting binary class from continuous values of neural network output

Question

I have a custom neural network that I wrote from scratch. It does a lot of mathematical computation, and its output is a continuous value.

I want to get a binary class output from these continuous values. I applied the sigmoid function to the outputs, but thresholding the sigmoid values at 0.5 does not give the correct class labels: for the training data, the sigmoid values are all around 0.5055, so every example is classified as class 1.

I can't tell whether there is another way I should discretize the values, why the sigmoid is not working well for me, or whether there is an optimal way to choose the threshold.

One thing I gathered from an answer is that the sigmoid might not be working well because of class weights?
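
As an illustration of the situation described above, here is a minimal sketch. The raw outputs are hypothetical values, chosen only so that their sigmoid values sit near 0.5055, and the helper names are not from the original network. Because the sigmoid is strictly increasing, thresholding the sigmoid value at `t` gives the same labels as thresholding the raw output at `logit(t)`, so the sigmoid by itself does not separate the classes any better than the raw scores do.

```python
import math

def sigmoid(z):
    """Logistic function mapping a real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    """Inverse of the sigmoid, defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))

# Hypothetical raw network outputs (assumption: small positive values).
raw_outputs = [0.010, 0.015, 0.030, 0.035]
probs = [sigmoid(z) for z in raw_outputs]

# With the default cut-off of 0.5, every example lands in class 1.
labels_default = [int(p >= 0.5) for p in probs]

# A cut-off of 0.5055 on the sigmoid scale is a cut-off of about 0.022
# on the raw scale, and both produce identical labels.
threshold = 0.5055
raw_cutoff = logit(threshold)
labels_from_probs = [int(p >= threshold) for p in probs]
labels_from_raw = [int(z >= raw_cutoff) for z in raw_outputs]

print(labels_default, labels_from_probs, labels_from_raw)
```

Raw outputs this close to zero suggest the scores are only weakly spread out, which is a separate matter from where the threshold is placed.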

Why is it wrong to predict everything as being in class $1$? [Why even go from probability prediction to discrete class at all?](https://stats.stackexchange.com/questions/312780/why-is-accuracy-not-the-best-measure-for-assessing-classification-models/312787#312787) — Dave, Dec 06 '21 at 17:15
Because I printed the training outputs along with their original classes, and many of those values should be class 0. If I use 0.5055 as the sigmoid threshold, many outputs are correctly classified as 0. — Alex, Dec 06 '21 at 17:18
Choosing 0.5 is often seen as a default, but as you’ve demonstrated, it’s not always going to yield the desired trade-off between TPR and FPR. A ROC curve displays how that trade-off changes as you vary the threshold. We can’t tell you what the correct threshold choice is because we don’t know anything about the costs of a misclassification. — Sycorax, Dec 06 '21 at 17:32
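
The trade-off described in the comment above can be sketched by sweeping the threshold and computing TPR and FPR at each value, which gives the points of a ROC curve. The labels and scores below are made-up illustrative data, and `tpr_fpr` is a hypothetical helper, not part of any library.

```python
def tpr_fpr(y_true, scores, threshold):
    """Return (TPR, FPR) when predicting class 1 for scores >= threshold."""
    pairs = list(zip(y_true, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr

# Made-up labels and sigmoid scores clustered just above 0.5.
y_true = [0, 0, 0, 1, 0, 1, 1, 1]
scores = [0.5040, 0.5048, 0.5052, 0.5054, 0.5057, 0.5058, 0.5061, 0.5066]

# One (threshold, TPR, FPR) triple per distinct score value.
roc_points = [(t, *tpr_fpr(y_true, scores, t)) for t in sorted(set(scores))]

for t, tpr, fpr in roc_points:
    print(f"threshold={t:.4f}  TPR={tpr:.2f}  FPR={fpr:.2f}")
```

On this toy data a threshold of 0.5 gives TPR = 1 and FPR = 1 (everything is labelled class 1), while 0.5055 gives TPR = 0.75 and FPR = 0.25. Which point is preferable depends on the misclassification costs.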
@Sycorax can you guide me on how to weigh the cost of misclassification, given that the model I am building identifies a life-threatening disease, and on how to figure out the best threshold for my sigmoid? — Alex, Dec 06 '21 at 17:49
@Alex You might be interested in [a question of mine.](https://stats.stackexchange.com/questions/464636/proper-scoring-rule-when-there-is-a-decision-to-make-e-g-spam-vs-ham-email) Frank Harrell's blog is a good read, too: [(1)](https://www.fharrell.com/post/classification/) [(2)](https://www.fharrell.com/post/class-damage/). — Dave, Dec 06 '21 at 17:51
I would consult a high-quality textbook that addresses using statistical models in a medical context, such as Frank Harrell's Regression Modeling Strategies (and many of his papers). In a medical context, the costs of an error could be small (a patient receives an unnecessary dose of antibiotics) or severe (a patient has a healthy limb amputated unnecessarily). This is why understanding the costs of misclassification is important. — Sycorax, Dec 06 '21 at 17:53
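
To make the role of costs concrete, here is a minimal sketch that picks the threshold minimizing total misclassification cost on a labelled sample. The cost values, the data, and the function names are all assumptions for illustration; real costs have to come from the medical context, and a threshold chosen this way should be checked on held-out data.

```python
def total_cost(y_true, scores, threshold, cost_fp, cost_fn):
    """Total cost when predicting class 1 for scores >= threshold."""
    pairs = list(zip(y_true, scores))
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    return cost_fp * fp + cost_fn * fn

def best_threshold(y_true, scores, cost_fp, cost_fn):
    """Lowest-cost threshold among the observed score values.

    Ties are broken in favour of the smallest threshold.
    """
    candidates = sorted(set(scores))
    return min(
        candidates,
        key=lambda t: total_cost(y_true, scores, t, cost_fp, cost_fn),
    )

# Made-up labels and sigmoid scores.
y_true = [0, 0, 0, 1, 0, 1, 1, 1]
scores = [0.5040, 0.5048, 0.5052, 0.5054, 0.5057, 0.5058, 0.5061, 0.5066]

# Hypothetical costs: a missed case is 10x worse than a false alarm...
t_miss_costly = best_threshold(y_true, scores, cost_fp=1, cost_fn=10)
# ...versus a false alarm being 10x worse than a missed case.
t_alarm_costly = best_threshold(y_true, scores, cost_fp=10, cost_fn=1)

print(t_miss_costly, t_alarm_costly)
```

On this toy data the chosen threshold moves from 0.5054 to 0.5058 when the cost ratio is reversed, which is the sense in which the "right" threshold depends on the costs and not on the model alone.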
Thanks @Dave and Sycorax. I'll read through these links and follow up on the ideas — Alex, Dec 06 '21 at 17:57
Yes, @StephanKolassa this resonates a lot with what Dave and Sycorax have highlighted. I read this answer earlier but was not able to grasp it well at the time. I guess the intuition I had was that if my sigmoid threshold was not 0.5, then my model was wrong, but it looks like it doesn't have to be so! — Alex, Dec 06 '21 at 18:02
Exactly. A threshold of 0.5 looks so *intuitively correct* that many people never even stop to think about it. But it may well be suboptimal. (Or per my answer at the linked thread, we may even need *multiple* thresholds, to choose one of *more than two* different possible decisions.) — Stephan Kolassa, Dec 06 '21 at 18:43
