
In my scenario, I use deep reinforcement learning to solve a transportation-related problem. When I plot the gradient and the loss during training, I find that the gradient first converges and then explodes, while the loss never converges. I wonder what causes this.
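A common first diagnostic for this symptom is to log the global gradient norm each step and clip it, so a sudden explosion cannot wipe out earlier progress. Below is a minimal sketch, assuming PyTorch and a toy linear model (these are illustrative stand-ins, not the asker's actual network or data):

```python
import torch
import torch.nn as nn

# Toy model and L1 objective standing in for the real setup.
model = nn.Linear(4, 2)
loss_fn = nn.L1Loss()  # mean absolute error, as an illustrative objective

x = torch.randn(8, 4)
target = torch.randn(8, 2)

loss = loss_fn(model(x), target)
loss.backward()

# clip_grad_norm_ returns the global L2 norm *before* clipping --
# this is exactly the quantity worth plotting over training --
# and rescales the gradients in place so their norm is at most max_norm.
grad_norm = torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
```

If the logged `grad_norm` spikes at the same step the reward curve destabilizes, clipping (or lowering the learning rate) is the usual first remedy.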

The first graph shows one of the output gradients, and the second shows the cumulative reward for each sample (red curve) together with the corresponding loss (blue curve).

[figures: output gradient over training; cumulative reward (red) and loss (blue) per sample]

mac wang
  • Most likely you are encountering some kind of singularity in your loss function. Could you post the exact loss function here? – Alex R. Dec 05 '17 at 00:01
  • The loss is the sum of absolute errors between the real data and the estimated result; it only indicates the performance of the action and is not used for training. – mac wang Dec 05 '17 at 00:42
  • Your loss function depends on hyperparameters, inputs and the structure of your neural network along with activations in the last layer. Please post a more detailed description that includes these. Otherwise this question is unanswerable. – Alex R. Dec 05 '17 at 00:51
  • Sorry, the loss is only an indicator, it is the reward function that is used to calculate the gradient in reinforcement learning. – mac wang Dec 05 '17 at 00:55
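The comment exchange describes a policy-gradient setup where the reward, not the plotted loss, drives the parameter updates. A hedged sketch of that arrangement (assumed REINFORCE-style update in PyTorch; the policy network, reward value, and indicator loss are all illustrative, not the asker's code):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Tiny illustrative policy over 3 discrete actions.
policy = nn.Sequential(nn.Linear(4, 3), nn.Softmax(dim=-1))

state = torch.randn(1, 4)
dist = torch.distributions.Categorical(policy(state))
action = dist.sample()

reward = 1.0  # placeholder scalar reward from the environment

# REINFORCE objective: ascend E[log pi(a|s) * R]. This term is what
# actually produces the gradients being plotted in the question.
pg_loss = -dist.log_prob(action) * reward
pg_loss.backward()

# Separate diagnostic L1 error, computed but never backpropagated --
# matching the comment that the "loss" is only a performance indicator.
indicator = F.l1_loss(torch.randn(5), torch.randn(5))
```

Under this arrangement there is no reason for the indicator loss to decrease monotonically, since nothing optimizes it directly; only the reward signal shapes the gradients.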

0 Answers