
Is it true that:

After training a multilayer perceptron with one hidden layer using gradient descent, we always get the same decision boundary regardless of the initialization point.

user42493
  • The titular question does not match the text in the question itself. – Frans Rodenburg Oct 29 '19 at 07:40
  • I thought that the two are equivalent? – user42493 Oct 29 '19 at 12:56
  • Sorry, let me elaborate. A function can have a single global minimum, but that does not exclude the possibility of local minima or [saddle points](https://stats.stackexchange.com/a/279094). Hence, having a global minimum does not mean your optimizer will reach it irrespective of the starting point. – Frans Rodenburg Oct 29 '19 at 13:51

1 Answer


NO.

With one hidden layer, the objective function is not convex (even for squared loss in regression). Therefore, it can have many local minima rather than a single global minimum.

With gradient descent, where the optimization ends up depends on the initialization point, so different initializations can produce different weights and hence different decision boundaries.
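Here is a minimal sketch (not from the original answer) illustrating this: the same one-hidden-layer MLP is trained on XOR with plain batch gradient descent from several random initializations. The final losses and predictions typically differ across seeds, because the non-convex objective has multiple minima. All hyperparameters (hidden size, learning rate, number of steps) are illustrative choices.

```python
import numpy as np

def train_mlp(seed, hidden=3, lr=0.5, steps=5000):
    rng = np.random.default_rng(seed)
    # XOR data: not linearly separable, so the hidden layer matters.
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([[0], [1], [1], [0]], dtype=float)

    # Random initialization of both weight matrices.
    W1 = rng.normal(scale=1.0, size=(2, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(scale=1.0, size=(hidden, 1))
    b2 = np.zeros(1)

    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

    for _ in range(steps):
        # Forward pass.
        H = sigmoid(X @ W1 + b1)        # hidden activations
        out = sigmoid(H @ W2 + b2)      # network output

        # Backward pass for squared loss 0.5 * (out - y)^2.
        d_out = (out - y) * out * (1 - out)
        d_H = (d_out @ W2.T) * H * (1 - H)

        # Gradient-descent updates.
        W2 -= lr * H.T @ d_out
        b2 -= lr * d_out.sum(axis=0)
        W1 -= lr * X.T @ d_H
        b1 -= lr * d_H.sum(axis=0)

    # Evaluate the trained network.
    H = sigmoid(X @ W1 + b1)
    out = sigmoid(H @ W2 + b2)
    loss = 0.5 * np.mean((out - y) ** 2)
    return loss, (out > 0.5).astype(int).ravel()

for seed in (0, 1, 2):
    loss, preds = train_mlp(seed)
    print(f"seed={seed}  final loss={loss:.4f}  predictions={preds}")
```

Running this, some seeds reach a solution that solves XOR while others can get stuck at a higher loss, i.e. the learned decision boundary is not the same regardless of initialization.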

Haitao Du