If you're using a numerically stable version of the sigmoid function, something like your proposal is already being done to prevent overflow. In the function
$$f(x) = \frac{1}{\exp(-x)+1},$$
very negative values of $x$ cause $\exp(-x)$ to overflow. To remedy that, a sigmoid implementation might look something like
    from math import exp

    def sigmoid(x):
        # Cut off before exp(-x) can blow up for very negative x.
        if x < -20.0:
            return 0.0
        else:
            return 1.0 / (exp(-x) + 1.0)
We don't need to worry about the case of large $x$: there $\exp(-x)$ simply underflows to zero and we get $\frac{1}{0+1}=1$.
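A quick way to see the asymmetry between the two tails is to try the unguarded formula on both (a small sketch; it assumes CPython's `math` module, where overflow raises an exception, whereas NumPy's `exp` would instead return `inf` with a runtime warning):

    from math import exp

    # Large positive x: exp(-x) underflows silently to 0.0, so the
    # unguarded formula returns exactly 1.0 with no complaint.
    print(1.0 / (exp(-1000.0) + 1.0))        # 1.0

    # Very negative x: exp(-x) exceeds the largest representable float,
    # and math.exp raises instead of returning a number.
    try:
        print(1.0 / (exp(1000.0) + 1.0))
    except OverflowError as err:
        print("overflow:", err)              # overflow: math range error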
The value -20 is chosen to stay safely short of the point where overflow in $\exp(-x)$ would produce a non-finite result or an error. A different cutoff could be more appropriate depending on the floating-point precision and the particular usage; in particular, we might want to pick the value carefully to avoid erratic behavior right at the edge where precision runs out.
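If you want to tie the cutoff to the precision actually in use, one option (a sketch under my own assumptions; the helper name and the choice of the smallest normal number as the threshold are mine, not part of the argument above) follows from the fact that for very negative $x$ we have $\sigma(x) \approx e^{x}$, so the output drops below what the dtype can represent as a normal number roughly when $x < \log(\text{tiny})$:

    import numpy as np

    def suggested_cutoff(dtype):
        # For very negative x, sigmoid(x) ~ exp(x), so the output falls
        # below the smallest positive normal number of `dtype` roughly
        # when x < log(tiny); past that point returning 0.0 loses nothing.
        return float(np.log(np.finfo(dtype).tiny))

    print(suggested_cutoff(np.float32))   # about -87.3
    print(suggested_cutoff(np.float64))   # about -708.4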
The purpose is not to conserve compute time, because the difference between 0.01 and 0.001 can be very important to your computation. Throwing away that precision could give bogus results, such as stopping a gradient-based method in its tracks because the gradient suddenly becomes zero. Whether or not it's a good idea to compromise your computation's precision to get a small performance increase should be decided on a case-by-case basis, since the cost of imprecision could be very high in one instance but negligible in another.
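As a minimal sketch of that gradient point (the numbers are illustrative; it uses the standard identity $\sigma'(x) = \sigma(x)\,(1 - \sigma(x))$):

    # Roughly sigmoid(-20) at full precision versus after an aggressive cutoff.
    s_exact = 2.06e-9
    s_truncated = 0.0

    grad_exact = s_exact * (1.0 - s_exact)              # tiny but nonzero
    grad_truncated = s_truncated * (1.0 - s_truncated)  # exactly zero: the update stalls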