Step and switch-like functions can be thought of as deterministic switches at some threshold, so smooth sigmoid-like functions can be read as describing the same switch with uncertainty around that threshold.
The machine learning literature discusses 'activation functions' extensively, but the choices are usually justified by application results. I am not aware of an argument for particular functions from first principles.
The Heaviside function is sometimes approximated by $H(x) \approx \frac{1}{2} + \frac{1}{2}\tanh(rx)$, but I think the Gaussian CDF has a more immediate probabilistic interpretation.
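To make the probabilistic reading concrete, here is a minimal sketch under an assumption that is mine and not a claim about any canonical model: the switch is a hard threshold at 0 whose position is jittered by additive noise, so the probability that input $x$ is "on" equals the noise CDF evaluated at $x$. With Gaussian jitter this gives the Gaussian CDF. The $\tanh$ form is algebraically the logistic CDF with scale $1/(2r)$, so it corresponds to logistic jitter. The function names and the values of `r` and `sigma` are illustrative only.

```python
import math
import random


def tanh_step(x, r):
    """Smooth step 1/2 + 1/2 * tanh(r * x)."""
    return 0.5 + 0.5 * math.tanh(r * x)


def logistic_cdf(x, scale):
    """CDF of a zero-mean logistic distribution with the given scale."""
    return 1.0 / (1.0 + math.exp(-x / scale))


def gaussian_cdf(x, sigma):
    """CDF of a zero-mean Gaussian with standard deviation sigma."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))


def noisy_switch_frequency(x, sigma, n, rng):
    """Fraction of n trials in which x exceeds a threshold drawn from N(0, sigma^2).

    Assumption: the threshold sits at 0 and is perturbed by Gaussian noise.
    """
    hits = sum(1 for _ in range(n) if x > rng.gauss(0.0, sigma))
    return hits / n


rng = random.Random(0)  # fixed seed so runs are repeatable
r, sigma = 1.5, 1.0     # illustrative values, not taken from any source

for x in (-1.0, 0.0, 0.5, 2.0):
    # tanh form and logistic CDF with scale 1/(2r) agree up to rounding
    same = math.isclose(tanh_step(x, r), logistic_cdf(x, 1.0 / (2.0 * r)))
    # empirical switching frequency versus the Gaussian CDF
    freq = noisy_switch_frequency(x, sigma, 5000, rng)
    print(f"x={x:+.1f}  tanh==logistic: {same}  "
          f"simulated={freq:.3f}  gaussian_cdf={gaussian_cdf(x, sigma):.3f}")
```

Under this reading, choosing a smooth step amounts to choosing a noise distribution for the threshold, which is one way to phrase what "canonical" would have to mean.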
Since probability distributions can be described as arising from idealized processes (e.g. the Gaussian from Brownian motion, the gamma from a sequence of exponential waiting times, the binomial from repeated sampling with a fixed probability), my question is whether there is a canonical choice, or canonical choices depending on the process that generates the step function.
EDIT
The question is not whether there are continuous approximations to the step or Heaviside functions, nor whether there are functions that are popular or commonly used as an alternative to the discontinuous step (so it's different from this question). The question is whether there are functions that represent probabilistic switch-like behavior and arise naturally from probabilistic reasoning (if not in general, then at least for specific processes that generate some sort of switch).