
Suppose you have a classification problem in which you want to classify inputs into one of two mutually exclusive classes (y1 and y2) using an artificial neural network that models P(y|x).

Among the two following architectures for the output layer, which one is better to model P(y|x)?

  1. use two output neurons, one for each class, with a softmax activation function. If a1 and a2 are the outputs of the two output neurons, P(y=y1|x)=a1 and P(y=y2|x)=a2 with a1+a2=1.
  2. use a single output neuron with a sigmoid activation function. If a is the output of the neuron, we can set P(y=y1|x)=a and P(y=y2|x)=1-a.
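
For concreteness, here is a minimal sketch of the two output heads in PyTorch (the framework choice and the hidden width hidden_dim are my own placeholders, not part of the question):

    import torch
    import torch.nn as nn

    hidden_dim = 32  # placeholder width of the last hidden layer

    # Option 1: two output neurons + softmax; P(y=y1|x)=a1, P(y=y2|x)=a2, a1+a2=1
    two_output_head = nn.Sequential(
        nn.Linear(hidden_dim, 2),
        nn.Softmax(dim=-1),
    )

    # Option 2: one output neuron + sigmoid; P(y=y1|x)=a, P(y=y2|x)=1-a
    one_output_head = nn.Sequential(
        nn.Linear(hidden_dim, 1),
        nn.Sigmoid(),
    )

    h = torch.randn(4, hidden_dim)             # a batch of hidden activations
    probs_two = two_output_head(h)             # shape (4, 2), each row sums to 1
    a = one_output_head(h)                     # shape (4, 1)
    probs_one = torch.cat([a, 1 - a], dim=-1)  # the same two class probabilities

Both heads produce a valid distribution over the two classes; they differ only in how that distribution is parameterized.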

I can see the following two advantages, which suggest that the choice depends on the specific problem:

  • In 1., the last layer has twice as many parameters as in 2., and thus has more flexibility and can potentially model more complicated relationships.
  • In 2., the last layer has half as many parameters as in 1., and thus is less prone to overfitting (a quick parameter count is sketched below).
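
A back-of-the-envelope count of the output-layer parameters, assuming a last hidden layer of width n (n = 32 is only an illustrative value):

    # Output-layer parameters only, for a hidden layer of width n
    n = 32
    params_option_1 = 2 * n + 2   # two neurons: 2*n weights + 2 biases
    params_option_2 = n + 1       # one neuron: n weights + 1 bias
    print(params_option_1, params_option_2)  # 66 33

So the difference is confined to the output layer; the rest of the network is identical in both cases.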
AdeB
  • @MarcClaesen If you have two neurons (or more) with a softmax activation, the sum has to be one. – AdeB Nov 10 '14 at 19:22
  • oops, read over the softmax activation. – Marc Claesen Nov 10 '14 at 20:26
  • Check this out: http://stats.stackexchange.com/questions/207049/neural-network-for-binary-classification-use-1-or-2-output-neurons?noredirect=1&lq=1 Isn't it the same? And so the fewer the outputs the better, as it will update faster? – Peter Teoh Jan 30 '17 at 15:03
  • How ridiculous! The original question is blocked as a duplicate of the other one. Actually, this question was asked before the other one. – hafiz031 Apr 23 '20 at 14:45

0 Answers