My understanding is that BFGS and L-BFGS solve the same type of optimization problems as gradient descent (GD) and its variants.

Why, then, is GD the go-to algorithm for training neural networks?

Skander H.

1 Answer

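You can use L-BFGS to train neural networks, and some libraries include it as an optimizer (PyTorch, for example, has torch.optim.LBFGS). However, it needs more memory than plain gradient descent: besides the current gradient, it keeps a history of recent parameter and gradient differences, so its extra storage grows with the history size times the number of parameters. Full BFGS is far worse, since it stores a dense matrix with one entry per pair of parameters. For large networks it is therefore often more reasonable to use the gradient descent family.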
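To put rough numbers on the memory argument, here is a back-of-the-envelope sketch. The function name optimizer_memory_floats is made up for illustration, and the history size of 10 is just an assumed value. It also ignores implementation details such as line-search buffers, so treat the output as an order-of-magnitude estimate only.

```python
def optimizer_memory_floats(n_params: int, history_size: int = 10) -> dict:
    """Rough count of floats each optimizer keeps besides the parameters.

    Hypothetical helper for illustration; history_size=10 is an assumption.
    """
    return {
        # Plain gradient descent: only the current gradient.
        "gd": n_params,
        # L-BFGS: current gradient plus `history_size` pairs of vectors
        # (parameter differences and gradient differences).
        "lbfgs": n_params + 2 * history_size * n_params,
        # Full BFGS: current gradient plus a dense n x n matrix.
        "bfgs": n_params + n_params * n_params,
    }


if __name__ == "__main__":
    counts = optimizer_memory_floats(n_params=1_000_000, history_size=10)
    for name, floats in counts.items():
        # Assume 4 bytes per value (float32).
        print(f"{name:>5}: {floats * 4 / 1e9:,.3f} GB")
```

For a network with a million parameters, the L-BFGS history is a constant factor larger than the GD state, while the dense matrix of full BFGS is out of reach entirely. That constant factor matters more as models grow.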