In Bayesian statistics, regularization corresponds to the choice of a prior. For ElasticNet the prior takes the form (Li and Lin, 2010)
$$
\pi(\boldsymbol\beta) \propto \exp\left\{ -\lambda_1 \| \boldsymbol\beta \|_1 - \lambda_2 \| \boldsymbol\beta \|_2^2 \right\}
$$
This distribution is unnormalized. The paper that you refer to by Hans (2011) "broadens the scope of the Bayesian connection by providing a complete characterization of a class of prior distributions that generate the elastic net estimate as the posterior mode." The author proposes a normalized prior distribution that can be considered as an equivalent of ElasticNet regularization. Details and proofs can be found in the paper.
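To spell out the connection between the prior and the penalty: assuming a Gaussian likelihood $y \mid \boldsymbol\beta \sim \mathcal{N}(X\boldsymbol\beta, \sigma^2 I)$ (the standard regression setting, stated here only to make the derivation explicit), the posterior mode is
$$
\hat{\boldsymbol\beta} = \arg\max_{\boldsymbol\beta} \left[ \log p(y \mid \boldsymbol\beta) + \log \pi(\boldsymbol\beta) \right] = \arg\min_{\boldsymbol\beta} \left[ \frac{1}{2\sigma^2} \| y - X\boldsymbol\beta \|_2^2 + \lambda_1 \| \boldsymbol\beta \|_1 + \lambda_2 \| \boldsymbol\beta \|_2^2 \right],
$$
which is the elastic net objective with the penalty weights rescaled by $2\sigma^2$.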
$\ell_0$ regularization can be thought of (Polson and Sun, 2017) as using a prior that is a mixture of a Dirac delta centered at zero, $\delta_0$, and a Gaussian
$$
\pi(\beta_i) = (1 - \theta)\, \delta_0(\beta_i) + \theta\, \mathcal{N}(\beta_i \mid 0, \sigma_\beta^2)
$$
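As a minimal sketch of what sampling from this spike-and-slab prior looks like (the values of $\theta$ and $\sigma_\beta$ below are arbitrary, chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative, arbitrary settings: theta is the prior inclusion
# probability, sigma_beta the standard deviation of the Gaussian slab.
theta, sigma_beta, p = 0.2, 1.0, 100_000

# Draw from the mixture: with probability (1 - theta) a coefficient is
# exactly zero (the Dirac spike), otherwise it comes from N(0, sigma_beta^2).
include = rng.random(p) < theta
beta = np.where(include, rng.normal(0.0, sigma_beta, p), 0.0)

print(np.mean(beta == 0.0))  # close to 1 - theta = 0.8: exact zeros
```

Unlike, say, a Laplace (Lasso) prior, draws from this mixture contain exact zeros with positive probability, which is what makes it a genuinely sparse prior.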
You asked in the comments if $\pi(\boldsymbol\beta)\propto \exp\{-\lambda \|\boldsymbol\beta\|_0\}$ would be a proper prior equivalent to $\ell_0$ regularization. First, recall that $\ell_0$ is not a proper norm. Second, think of what this prior would do: since a single point has zero probability under a continuous distribution, it would put zero probability mass on parameters equal to exactly zero, and a constant density on all the other values. It would be an improper prior, and it would not do much in the way of regularization either, because it would be essentially uniform over all non-zero values. That is why the prior above has two components: a point mass for exact zeros, $(1 - \theta)\, \delta_0$, and a non-uniform component for all the other values, $\theta\, \mathcal{N}(0, \sigma_\beta^2)$. It works differently than $\ell_0$ regularization, but $\ell_0$ regularization itself is almost never used, because it is problematic even from an optimization point of view (the problem is combinatorial and non-convex).
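To make the impropriety concrete: the set of vectors in $\mathbb{R}^p$ with at least one component exactly equal to zero has Lebesgue measure zero, so $\|\boldsymbol\beta\|_0 = p$ almost everywhere and
$$
\int_{\mathbb{R}^p} \exp\{-\lambda \|\boldsymbol\beta\|_0\} \, d\boldsymbol\beta = e^{-\lambda p} \int_{\mathbb{R}^p} d\boldsymbol\beta = \infty,
$$
so no normalizing constant can make this a proper distribution.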
Notice however that there are multiple priors that can lead to sparse solutions (see van Erp et al., 2019), and the priors that correspond most closely to the penalties do not necessarily perform as well as the traditional penalized estimators. The priors may be mathematically equivalent to the penalties, but different estimation methods, implementations, and other technical nuances can lead to differences in the results, so in practice other priors may be preferable.