There are a couple of ways you can do this. The first is to use the posterior mean of each $\mu_i$ as the "fitted value" corresponding to $\hat{\beta}X$ in OLS, calculate the residuals from those fitted values, and plug their variance into the $R^2$ calculation as usual. An alternative is to use the posterior mean of the residual variance ($1/\tau$) directly as the estimate of the residual variance in the $R^2$ calculation. Either way, the calculation itself would be done in R. The former comes closest to how $R^2$ is calculated in classical statistics.
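For concreteness, here is a minimal sketch of both calculations in R. It assumes the JAGS output has been collected into a coda object `fit` (e.g. from `rjags` or `R2jags`) with monitored nodes `mu[i]` and `tau`, and that `y` holds the observed responses; all object names are illustrative.

```r
post   <- as.matrix(fit)                                     # stack the MCMC chains into one matrix
mu.hat <- colMeans(post)[paste0("mu[", seq_along(y), "]")]   # posterior mean of each mu_i

ss.res <- sum((y - mu.hat)^2)                                # residual sum of squares
ss.tot <- sum((y - mean(y))^2)                               # total sum of squares
r2.fit <- 1 - ss.res / ss.tot                                # approach 1: residual-based R^2

sigma2 <- mean(1 / post[, "tau"])                            # posterior mean of the residual variance
r2.tau <- 1 - sigma2 / var(y)                                # approach 2: variance-based R^2
```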
No doubt there are other approaches, which (hopefully) others will point out in their answers.
However, the bigger issues are a) with $R^2$ as a criterion at all and b) with comparing OLS estimation to anything else using $R^2$ as the criterion. I'll skip over the first, pointing you to statisticalengineering.com and Andrew Gelman as references. The second issue arises because, for a given set of predictors, OLS maximizes the in-sample $R^2$ (a consequence of the "least squares" property), so no other technique that is not equivalent to OLS will produce as high an $R^2$ on the data used to fit the model. Consequently, your Bayesian approach is doomed if maximizing $R^2$ is the criterion of choice.
You might instead use an out-of-sample criterion, for example k-fold cross-validation, which would necessitate multiple runs of JAGS on subsets of the data, then comparing the out-of-sample predicted values to the actual values. You can generate the predicted values inside JAGS, as described in the answer to Missing values in response variable in JAGS, or in R of course; a sketch of one fold follows.
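Here is a rough sketch of a single fold of that procedure using `rjags`, with held-out responses set to `NA` so JAGS imputes them from the posterior predictive distribution. The simple linear regression model, the variable names (`x`, `y`, `test.idx`), and the run length are all illustrative rather than taken from your model.

```r
library(rjags)

model.string <- "
model {
  for (i in 1:n) {
    y[i]  ~ dnorm(mu[i], tau)       # held-out y[i] (set to NA) are imputed by JAGS
    mu[i] <- beta0 + beta1 * x[i]
  }
  beta0 ~ dnorm(0, 1.0E-6)
  beta1 ~ dnorm(0, 1.0E-6)
  tau   ~ dgamma(0.01, 0.01)
}"

y.train           <- y
y.train[test.idx] <- NA             # hold out one fold of the data

jm  <- jags.model(textConnection(model.string),
                  data = list(y = y.train, x = x, n = length(y)))
sam <- coda.samples(jm, variable.names = "y", n.iter = 5000)

y.pred     <- colMeans(as.matrix(sam))[paste0("y[", test.idx, "]")]  # posterior predictive means
fold.error <- sum((y[test.idx] - y.pred)^2)                          # out-of-sample error, this fold
```

Looping this over the k folds and summing the fold errors gives a cross-validated error that puts the Bayesian model and OLS on an equal footing.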
I'll also point out that the dgamma(0.01, 0.01) distribution has largely fallen out of favor, as it is actually quite informative near zero. The answers to priors for lognormal models might help with that.
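If you do want to move away from it, one widely used alternative (in the spirit of Gelman's 2006 recommendations on variance priors, not anything specific to your model) is to put a flat or weakly informative prior on the standard deviation and derive the precision from it inside the JAGS model:

```
sigma ~ dunif(0, 100)          # weakly informative prior on the residual sd
tau   <- 1 / (sigma * sigma)   # precision used in dnorm(mu[i], tau)
```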