In MATLAB, I understand that if a number gets closer to zero than realmin, MATLAB converts the double to a denormal. I am noticing that this incurs a significant performance cost. In particular, I am using a gradient descent algorithm in which, near convergence, the gradients (of my bespoke neural network) drop below realmin, and the algorithm then slows down heavily (due to, I am assuming, type conversion behind the scenes). I have used the following code to validate my gradient matrices so that no number falls below realmin:
function mat = validateSmallDoubles(obj, mat, threshold)
    % Zero out every element whose magnitude does not exceed threshold,
    % so that no entry of mat remains in the denormal range.
    mat = mat .* (abs(mat) > threshold);
end
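For context, here is roughly how I apply it during training. The names are stand-ins: gradW is one gradient matrix, and maxDivisor is an assumed upper bound on any divisor applied to the gradients after validation, which gives the threshold some headroom above realmin:

maxDivisor = 1e6;                   % assumption: worst-case later divisor
threshold  = realmin * maxDivisor;  % headroom so mat/maxDivisor stays normal
gradW = obj.validateSmallDoubles(gradW, threshold);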
Is this usual practice, and what value should threshold take? (Obviously you want it as close to realmin as possible, but not so close that any additional division operations send some elements of mat below realmin after validation.) Also, specifically for neural networks, where are the best places to do gradient validation without ruining the network's ability to learn? I would be grateful to know what solutions people with experience in training neural networks have found. I am sure this is a problem in all languages. Tentative threshold values have ruined my network's learning.
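For reference, a minimal timing sketch along these lines (the array size and multiplier are arbitrary choices) can confirm whether subnormal arithmetic itself is the slowdown on a given machine:

n = 1e7;
normals  = rand(n, 1) + 1;           % values safely in the normal range
subnorms = repmat(realmin/2, n, 1);  % subnormal (denormal) values
tic; s1 = normals  .* 1.5; toc       % fast path
tic; s2 = subnorms .* 1.5; toc       % typically much slower if the CPU takes the subnormal path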