Normalization is not at all straightforward, as this question indicates. Consider a small number of large outliers. Even though they don't contribute to the MAD, their normalized values under (value $-$ median)/MAD will be very large in absolute value, probably larger than they would be under (value $-$ mean)/SD normalization. If you are trying to get all your features on a common scale for, say, fair relative penalization in ridge regression, LASSO, or penalized maximum likelihood, even that choice of normalization will affect the results.
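Here is a minimal sketch of that effect, assuming synthetic data (the particular values and outlier magnitudes are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
# 95 well-behaved values plus 5 large outliers
x = np.concatenate([rng.normal(0, 1, 95), [50, 60, 70, 80, 90]])

# Classical scaling: (x - mean) / SD; the outliers inflate the SD itself
z_classical = (x - x.mean()) / x.std(ddof=1)

# Robust scaling: (x - median) / MAD; the outliers barely affect the MAD
mad = np.median(np.abs(x - np.median(x)))
z_robust = (x - np.median(x)) / (1.4826 * mad)  # 1.4826 makes MAD consistent with SD under normality

print("max |z|, classical:", np.abs(z_classical).max())  # modest: the inflated SD shrinks the outliers
print("max |z|, robust:   ", np.abs(z_robust).max())     # far larger, as described above
```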
In your case, with more than 50% identical values, none of the usual candidates for robust measures of scale will work: they all break down at that point. Like the MAD, the $S_n$ and $Q_n$ measures developed in the paper you cite break down at 50% identical values. I suppose you could try to use order statistics other than the median in some way, but then you move back toward a measure of scale dominated by outliers.
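The breakdown is easy to demonstrate. Below is a sketch using a naive $O(n^2)$ computation of $Q_n$ for clarity (not the efficient algorithm from the paper), on made-up data in which 60% of the values are identical:

```python
import numpy as np

x = np.array([3.0] * 60 + list(np.random.default_rng(1).normal(0, 1, 40)))

# MAD: with >50% identical values the median of absolute deviations is zero
mad = np.median(np.abs(x - np.median(x)))
print("MAD:", mad)  # 0.0

# Naive Qn: roughly the first quartile of the pairwise absolute differences
n = len(x)
pair_diffs = np.abs(np.subtract.outer(x, x)[np.triu_indices(n, k=1)])
h = n // 2 + 1
k = h * (h - 1) // 2  # order statistic used by Qn
qn = 2.2219 * np.sort(pair_diffs)[k - 1]  # 2.2219 is the usual consistency constant
print("Qn: ", qn)  # also 0.0: more than 25% of the pairwise differences vanish
```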
One thing that came to mind (against usual advice) is binning such features to treat them as ordinal variables. Binning might not be so bad here if the main interest is whether a feature value differs from the single highly prevalent value and, if so, in which direction; a sketch of that idea follows. That trades this problem for another difficult one, however: how best to normalize an ordinal variable. This page, this page, and this page provide entries into the discussion.
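A hedged sketch of the binning idea: collapse the feature to whether each value is below, equal to, or above the highly prevalent value, yielding a three-level ordinal variable. The $\{-1, 0, +1\}$ coding and the use of the mode as the cutpoint are illustrative assumptions, not a prescribed method:

```python
import numpy as np

def to_ordinal(x):
    """Map a feature to {-1, 0, +1}: below / equal to / above its most frequent value."""
    values, counts = np.unique(x, return_counts=True)
    prevalent = values[np.argmax(counts)]  # the dominant repeated value
    return np.sign(x - prevalent).astype(int)

# Made-up feature: 60% of values equal 3.0, the rest scattered around it
x = np.array([3.0] * 60 + [1.2, 2.5, 3.8, 5.0, 2.9, 4.4] * 5 + [0.5] * 10)
print(np.unique(to_ordinal(x), return_counts=True))
```

Note that this only restates the difficulty: the resulting ordinal codes still need some normalization before entering a penalized model.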
It seems that knowledge of the underlying subject matter and what you are trying to accomplish with normalization, rather than a simple algorithm, might provide the best answer to your question.