I noticed some interesting behaviour in my loss history while training my model.
Please note the sudden drop in test loss at around epoch 106; a similar drop appears again around epoch 1000.
It seems to me that the optimizer is escaping a local minimum, is this correct?
Can anyone explain this behaviour to me?
This is with Keras version 2.2.4, and the loss function is mean absolute error (MAE).
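For reference, here is a toy illustration (not my actual model; the function and all parameters are made up) of the kind of escape I mean: plain gradient descent gets stuck in the shallower well of a 1D double-well function, while the same descent with momentum carries past the barrier into the deeper minimum.

```python
# Toy double-well "loss": local minimum near x = +0.96, global minimum near x = -1.04.
def f(x):
    return (x**2 - 1)**2 + 0.3 * x

def grad(x):
    return 4 * x**3 - 4 * x + 0.3

def descend(x, lr=0.02, momentum=0.0, steps=300):
    """Gradient descent with optional (heavy-ball) momentum."""
    v = 0.0
    for _ in range(steps):
        v = momentum * v - lr * grad(x)
        x = x + v
    return x

x_plain = descend(2.0)               # no momentum: settles in the nearer, shallower well
x_mom = descend(2.0, momentum=0.85)  # momentum carries it over the barrier

print(f"plain GD ends at x = {x_plain:.3f}")  # near +0.96 (local minimum)
print(f"momentum ends at x = {x_mom:.3f}")    # near -1.04 (global minimum)
```

(In high-dimensional networks this intuition is usually framed in terms of saddle points and plateaus rather than true local minima, but the picture of a sudden drop after a long flat stretch is the same.)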