I am training an LSTM for a regression problem, but the loss randomly shoots up during training, as in the picture below:
I have tried multiple things to prevent this: adjusting the learning rate, changing the number of neurons in my layers, adding L2 regularization, and using clipnorm or clipvalue on the optimizer, but nothing seems to help. I checked my input data for null/infinity values and it contains none; it is also normalized. Here is my code for reference:
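(I can only show a simplified sketch here; it assumes Keras, and the layer size, regularization strength, and clipping value are placeholders rather than my exact settings.)

```python
# Simplified sketch of the setup (Keras assumed; all hyperparameter values are placeholders).
import numpy as np
from tensorflow.keras import layers, models, optimizers, regularizers

model = models.Sequential([
    layers.Input(shape=(100, 2)),                       # 100 timesteps, 2 features per step
    layers.LSTM(64, return_sequences=True,               # one output per timestep
                kernel_regularizer=regularizers.l2(1e-4)),
    layers.TimeDistributed(layers.Dense(1)),             # (100, 1) output sequence
])

# clipnorm (or clipvalue) on the optimizer is the gradient clipping I mentioned
model.compile(optimizer=optimizers.Adam(learning_rate=1e-3, clipnorm=1.0),
              loss="mse")

# dummy arrays with the shapes described below, just to show the fit call
x = np.random.rand(256, 100, 2).astype("float32")
y = np.random.rand(256, 100, 1).astype("float32")
model.fit(x, y, batch_size=32, epochs=2)
```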
Each input is a sequence of shape (100, 2) and the output is a sequence of shape (100, 1). I don't know how to deal with these exploding losses. The model works fine for easy regression cases, but fails when I increase the complexity of the problem. The task is similar to predicting a sine wave: in the easy case only the amplitude differs between sequences, while in the complex case the frequency, amplitude, and phase all differ. I appreciate any help!
PS: I have tried changing each hyperparameter individually and it seems useless. When I train on data from a simple sine wave, it works fine.
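To make the easy vs. complex distinction concrete, here is roughly how the two kinds of sine-wave data could be generated (an assumed sketch, not my actual dataset):

```python
# Assumed data-generation sketch: in the simple case only the amplitude varies
# between sequences; in the complex case amplitude, frequency, and phase all vary.
import numpy as np

def make_sine_data(n_samples, complex_case=False, seq_len=100):
    t = np.linspace(0.0, 2.0 * np.pi, seq_len)
    x = np.zeros((n_samples, seq_len, 2), dtype="float32")
    y = np.zeros((n_samples, seq_len, 1), dtype="float32")
    for i in range(n_samples):
        amp = np.random.uniform(0.5, 2.0)
        freq = np.random.uniform(0.5, 3.0) if complex_case else 1.0
        phase = np.random.uniform(0.0, 2.0 * np.pi) if complex_case else 0.0
        wave = amp * np.sin(freq * t + phase)
        x[i, :, 0] = t / t.max()                              # normalized time feature
        x[i, :, 1] = wave + 0.05 * np.random.randn(seq_len)   # noisy observation
        y[i, :, 0] = wave                                      # clean target sequence
    return x, y

x_easy, y_easy = make_sine_data(256, complex_case=False)
x_hard, y_hard = make_sine_data(256, complex_case=True)
```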