
According to the answer here: How to choose the number of hidden layers and nodes in a feedforward neural network?

How many hidden layers? Well, if your data is linearly separable (which you often know by the time you begin coding a NN) then you don't need any hidden layers at all.
  1. Why is this true?

  2. If the data is linearly separable:

    2.1 Do we only need an input layer and an output layer?

    2.2 Will the activation function on the output layer handle the separation by itself (i.e., is it enough)?

user3668129

1 Answer


Yes, hidden layers are not needed for linearly separable data, because the output layer already computes a linear combination of the features and outputs a number with discriminative power: $f(\sum_i w_ix_i + b)$, where the $w_i$ are the output neuron's weights, $b$ is the bias, and $f$ is the activation function. Linear separability means there exists a hyperplane $\sum_i w_ix_i + b = 0$ separating the classes in feature space. The activation function matters too, but as long as it translates into a decision rule of the form $\sum_i w_ix_i + b > \tau$ (as monotonic activations such as tanh and the sigmoid do), its exact form is not important. So you don't need hidden layers to discover a more complex decision boundary; the linear one found by the output layer alone already suffices. And you have the input layer as always, which just represents your features.
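As an illustration (a minimal sketch, not taken from the answer; the toy dataset, learning rate, and iteration count are my own choices): a network with no hidden layer is just logistic regression, and plain gradient descent on it classifies linearly separable data perfectly.

```python
import numpy as np

# Toy data, linearly separable with a margin around the line x1 + x2 = 1.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 2.0, size=(400, 2))
X = X[np.abs(X[:, 0] + X[:, 1] - 1.0) > 0.1]   # enforce a small margin
y = (X[:, 0] + X[:, 1] > 1.0).astype(float)

# "Network" with no hidden layer: a single output neuron with a sigmoid.
w = np.zeros(2)                                # output neuron's weights w_i
b = 0.0                                        # bias b
lr = 0.5

for _ in range(5000):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))     # f(sum_i w_i x_i + b), sigmoid
    grad = p - y                               # gradient of the log-loss w.r.t. the pre-activation
    w -= lr * (X.T @ grad) / len(y)
    b -= lr * grad.mean()

pred = (X @ w + b > 0).astype(float)           # decision rule: sum_i w_i x_i + b > 0
print("training accuracy:", (pred == y).mean())  # reaches 1.0: the data is separable
```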

gunes
  • Thanks. How does the activation function influence the output in this case? – user3668129 Jul 05 '21 at 12:32
  • For example, during learning it lets you define a proper loss function like log-loss instead of MSE, with which the problem can turn out to be non-convex – gunes Jul 05 '21 at 12:33
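To see the point in that last comment concretely (a minimal sketch, not from the thread; the one-parameter toy model is my own setup): with a sigmoid output, squared error is non-convex in the weight even for a single training point, while log-loss stays convex.

```python
import numpy as np

# One-parameter sigmoid model p(w) = sigmoid(w * x), fit to the single
# point (x = 1, y = 1); compare the two losses as functions of w.
w = np.linspace(-6, 6, 200)
p = 1.0 / (1.0 + np.exp(-w))        # x = 1, so the pre-activation is w

mse = (p - 1.0) ** 2                # squared error: plateaus for w << 0
logloss = -np.log(p)                # log-loss: convex in w (softplus of -w)

# Second differences approximate curvature; negative values mean non-convex.
print("MSE min curvature:     ", np.diff(mse, 2).min())      # < 0 (non-convex)
print("log-loss min curvature:", np.diff(logloss, 2).min())  # >= 0 (convex)
```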