Questions tagged [nesterov]

4 questions
13 votes, 1 answer

How to choose between SGD with Nesterov momentum and Adam?

I'm currently implementing a neural network architecture in Keras. I would like to reduce the training time, and I'm considering alternative optimizers such as SGD with Nesterov momentum and Adam. I've read several things about the pros and…
Clément F
2 votes, 0 answers

Nesterov's Momentum (or Accelerated Gradient)

This might be a silly question, but here it is anyway. I'm trying to implement Nesterov's momentum as an extension of the gradient descent algorithm I use for my neural network, which currently uses momentum. Now, I know that…
1 vote, 0 answers

How should I set $\vec{\mu}$ in NAdam optimization?

Dozat (2016) introduces a sequence of hyperparameters $\mu_0, \ldots, \mu_T$, where $T$ is the total number of iterations. Naturally, $T$ depends on the convergence of the parameters, so the sequence isn't fixed ahead of time. Dozat 2016…
DifferentialPleiometry
1 vote, 1 answer

Trying to write Nesterov Optimization - Gradient Descent

Problem: I'm unsure whether I have understood Nesterov optimization. I'm writing about Nesterov optimization, but the notation I'm using seems different from the references below. I worked it out using some books as guides. Would someone please clarify? Let…