I've been reading this tutorial on variational Bayes, which discusses sparse Bayesian learning (Relevance Vector Machines, if you prefer). In the paper they put a Gamma prior on the precision parameter, in this instance $Ga(\theta|\delta,\delta)$, where $\delta$ is small, so that $Ga(\theta|\delta,\delta)\propto\theta^{\delta-1}e^{-\delta\theta}\approx\frac{1}{\theta}.$
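For intuition, here is a minimal numerical check of that approximation (assuming SciPy; the grid and the value $\delta=10^{-3}$ are my own choices): if the density were exactly proportional to $1/\theta$, then $\theta\,p(\theta)$ would be constant, and for small $\delta$ it very nearly is.

```python
import numpy as np
from scipy.stats import gamma

delta = 1e-3                    # small shape = rate, as in the tutorial
theta = np.logspace(-1, 1, 5)   # precisions from 0.1 to 10

# Gamma(shape=delta, rate=delta); SciPy parameterizes by scale = 1/rate
pdf = gamma.pdf(theta, a=delta, scale=1 / delta)

# theta * p(theta) should be roughly flat if p(theta) ~ 1/theta
print(theta * pdf)  # entries all close to ~1e-3, i.e. nearly constant
```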
My question is: why, or when, would you put a Gamma prior on the precision rather than on the variance? I understand that in the case of sparse Bayesian learning it was useful to put this prior on the coefficient/weight precisions, since it effectively gave you a sparse prior on the weights (marginalizing each precision out yields a Student-t prior on the corresponding weight, sharply peaked at zero; see the sketch below).
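To make that concrete, the marginal $\int N(w|0,\theta^{-1})\,Ga(\theta|\delta,\delta)\,d\theta$ is a Student-t with $2\delta$ degrees of freedom and unit scale (the standard Gaussian-Gamma result). A minimal sketch of its shape, again assuming SciPy, with evaluation points chosen by me:

```python
import numpy as np
from scipy.stats import t, norm

delta = 1e-3
# Integrating N(w | 0, theta^{-1}) against Ga(theta | a, b) gives a
# Student-t with df = 2a and scale sqrt(b/a); here a = b = delta.
marginal = t(df=2 * delta)

w = np.array([0.0, 0.1, 1.0, 10.0])
# Compare shapes by normalizing each density at w = 0:
print(marginal.pdf(w) / marginal.pdf(0.0))  # drops fast near 0, heavy tails
print(norm.pdf(w) / norm.pdf(0.0))          # the Gaussian, for reference
```

The marginal concentrates far more mass near zero than a Gaussian while keeping heavy tails, which is what drives the sparsity.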
However, for general regression problems it would make sense to think that smaller variances are preferred, so we should put a $Ga(\theta|\delta,\delta)$ prior on the variance. Putting this prior on the precision would instead imply that smaller precisions, i.e. larger variances, are preferred. Or am I interpreting the prior incorrectly?
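One way to keep the two parameterizations straight is a change of variables: if the precision $\theta\sim Ga(\delta,\delta)$, then the variance $\sigma^2=1/\theta$ follows an Inverse-Gamma$(\delta,\delta)$, so the two statements describe the same prior. A minimal sampling check (assuming SciPy; $\delta=2$ is chosen only to keep the demo numerically tame, the equivalence holds for any $\delta$):

```python
import numpy as np
from scipy.stats import gamma, invgamma, kstest

rng = np.random.default_rng(0)
delta = 2.0  # moderate value purely for a well-behaved demo

# Precisions theta ~ Gamma(shape=delta, rate=delta); SciPy scale = 1/rate
theta = gamma.rvs(a=delta, scale=1 / delta, size=100_000, random_state=rng)

# The implied variances 1/theta should match InvGamma(shape=delta, scale=delta)
variance = 1.0 / theta
print(kstest(variance, invgamma(a=delta, scale=delta).cdf))
# expect a tiny KS statistic and a large p-value: same distribution
```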