
I am trying to implement exponential learning rate decay with the Adam optimizer for an LSTM. I do not want the 'staircase=True' version. decay_steps feels to me like the number of steps for which the learning rate stays constant, but I am not sure about this, and TensorFlow does not state it in its documentation. Any help is much appreciated.

Suleka_28
1 Answer


As mentioned in the function's source, the relation between decay_steps and decayed_learning_rate is the following:

    decayed_learning_rate = learning_rate *
                            decay_rate ^ (global_step / decay_steps)
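
For example, with illustrative values (not from the question) of learning_rate = 0.001, decay_rate = 0.96 and decay_steps = 10000, the decayed rate at global_step = 10000 is 0.001 * 0.96^1 = 0.00096, and at global_step = 20000 it is 0.001 * 0.96^2 ≈ 0.00092. With staircase=False the rate shrinks a little at every step rather than staying constant for decay_steps steps.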

Hence, decay_steps is not the number of steps the learning rate stays constant; it is the number of global steps over which the rate gets multiplied by one factor of decay_rate. You should therefore set decay_steps in proportion to the total number of global steps your training will run.
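
A minimal sketch of how this might be wired up with Adam in the TF 1.x API (the dummy loss and the specific learning_rate, decay_steps and decay_rate values are placeholders, not taken from the question):

    import tensorflow as tf

    # Dummy variable and loss just to keep the sketch self-contained;
    # in the question this would be the LSTM's loss.
    w = tf.Variable(5.0)
    loss = tf.square(w)

    global_step = tf.Variable(0, trainable=False)

    # Continuous exponential decay (staircase=False): the rate shrinks a
    # little at every step and has been multiplied by decay_rate once
    # after decay_steps global steps.
    learning_rate = tf.train.exponential_decay(
        learning_rate=0.001,
        global_step=global_step,
        decay_steps=10000,
        decay_rate=0.96,
        staircase=False)

    optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)
    # Passing global_step makes minimize() increment it on every update,
    # which is what advances the decay schedule.
    train_op = optimizer.minimize(loss, global_step=global_step)

In TF 2.x the equivalent would be a tf.keras.optimizers.schedules.ExponentialDecay schedule passed as the learning_rate of tf.keras.optimizers.Adam.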

OmG