My comments here are restricted to the 'square-root transformation' part of the
link and do not apply to the (much more useful) 'Box-Cox transformations'. In particular, I will discuss the use of square-root transformations with two-sample t tests.
Transformation for a specific example: Suppose you have two datasets of Poisson counts and want to test whether
the population Poisson means are equal. Here are two random Poisson samples
of size 20 each, one from $\mathsf{Pois}(\lambda=10)$ and one from
$\mathsf{Pois}(12).$
set.seed(2019)
x1 = rpois(20, 10); x2 = rpois(20, 12)
sort(x1); sort(x2)
[1] 4 4 5 7 7 7 8 9 9 10
[11] 11 11 11 12 12 12 12 14 14 17
[1] 7 7 9 10 10 10 11 11 12 12
[11] 12 12 12 14 14 14 16 19 21 23
Because the population variances are numerically the same as the Poisson
means, we expect that the sample variance of sample x1
may be smaller
than that of x2
. For our particular samples, this turns out to be true:
var(x1); var(x2)
[1] 12.06316
[1] 17.85263
There are theoretical reasons to believe that if we take square roots of
the Poisson counts, then the variances of the transformed samples will be
more nearly equal: the square root is approximately variance-stabilizing for
Poisson data, and a delta-method argument gives the variance of the transformed
counts as roughly $1/4$ regardless of the mean. That is also true for our data.
var(sqrt(x1)); var(sqrt(x2))
[1] 0.3376489
[1] 0.3218574
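A quick simulation check of that $1/4$ approximation (a side demonstration, not needed for the main argument) uses large samples at several values of $\lambda$:
# Side check: var(sqrt(X)) for Poisson X is roughly 1/4, whatever lambda
set.seed(101)   # arbitrary seed for this side check
lams = c(5, 10, 12, 20, 50)
sapply(lams, function(lam) var(sqrt(rpois(10^5, lam))))   # each roughly 0.25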
Pooled two-sample t test: Because the pooled two-sample t test requires the population variances to be
equal, it seems reasonable that we would get better results from this
t test by using the square-root transformed data.
Here are the P-values
from the (questionable) pooled t test on the original data (0.019) and from the test on the transformed counts (0.016). [The pooled t test on the original data is questionable because, for Poisson data, the population variances are unequal whenever the population means are unequal.]
In both cases, we would reject the null hypothesis that the
Poisson means of the two populations are equal. (Because this is a simulation, we have the advantage of knowing that they are different, and so knowing that this is the correct decision; for real-life data, we would not know).
t.test(x1, x2, var.eq=T)$p.val
[1] 0.01887121
t.test(sqrt(x1), sqrt(x2), var.eq=T)$p.val
[1] 0.01643156
Welch t test: The Welch two-sample t test does not require equal population variances, so
using that test on the original counts should be OK. [Notice that the argument var.eq=T
is omitted from the t.test
call here, so the default Welch test is performed.]
t.test(x1, x2)$p.val
[1] 0.01905609
Simulation: So for our data above, it makes no difference in the decision to reject
whether we use the variance-stabilizing transformation or not. The
question remains how often this is so. We can settle that for specific
instances with a simulation.
First, we check whether the true significance level of the pooled t test
is 5% with and without the square-root transformation. (With 100,000 iterations, such estimated probabilities are accurate to about two decimal places.) The answer is Yes.
set.seed(2019)
n = 20; lam1 = lam2 = 10
m = 10^5; pv.o = pv.t = numeric(m)
for(i in 1:m) {
x1 = rpois(n,lam1); x2 = rpois(n,lam2)
pv.o[i] = t.test(x1, x2, var.eq=T)$p.val
pv.t[i] = t.test(sqrt(x1), sqrt(x2), var.eq=T)$p.val
}
mean(pv.o <= .05)
[1] 0.04964 # approx. 5%
mean(pv.t <= .05)
[1] 0.05001 # approx. 5%
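As a rough check on the two-decimal-place accuracy mentioned above, the simulation margin of error of an estimated rejection probability near 5%, based on 100,000 iterations, can be computed directly:
sqrt(.05*.95/10^5)       # simulation standard error, about 0.0007
2*sqrt(.05*.95/10^5)     # 95% margin of error, about 0.0014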
A similar simulation with $\lambda_1 = 10$ and $\lambda_2 = 12$ uses
the same code, but with lam1 = 10; lam2 = 12
at the start. This enables
us to find the power of the pooled test (with and without transformation).
Both results are close to 46% power.
mean(pv.o <= .05)
[1] 0.45945
mean(pv.t <= .05)
[1] 0.45664
For $\lambda_1 = 10, \lambda_2 = 14,$ both estimates of power are about 95%.
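For convenience, the simulation can be wrapped in a small function (a sketch along the lines of the code above, with a reduced default number of iterations), so that the power of the pooled test, with and without the transformation, can be estimated for any pair of Poisson means:
# Sketch: estimated power of the pooled t test, with and without
# the square-root transformation, for Poisson means lam1 and lam2
pow.pois = function(lam1, lam2, n=20, m=10^4, alpha=.05) {
  pv.o = pv.t = numeric(m)
  for(i in 1:m) {
    x1 = rpois(n, lam1); x2 = rpois(n, lam2)
    pv.o[i] = t.test(x1, x2, var.eq=T)$p.val
    pv.t[i] = t.test(sqrt(x1), sqrt(x2), var.eq=T)$p.val
  }
  c(orig = mean(pv.o <= alpha), transf = mean(pv.t <= alpha))
}
set.seed(2019)
pow.pois(10, 14)    # both estimates should be roughly 0.95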
My conclusion is that for the specific situation investigated here,
it makes no practical difference whether one uses the square-root transformation or not---even with the pooled test where a difference
in variances might have caused trouble.
Notes: (a) Some years ago when teaching related topics I wanted to give
my class a real example with Poisson data where using the square-root transformation made a difference in the analysis using pooled t tests. Failing to find a real-life
example, I started using simulation to see if I could (with an appropriate
confession) produce one. Based on all the scenarios I tried (certainly
more than are shown here, including some one-factor ANOVAs), I concluded that the square-root transformation
is a nice theoretical idea, but of very limited practical use.
Maybe someone
on this site can find a convincing numerical example that supports the use
of the square-root transformation for t tests or one-factor ANOVAs.
(b) For more general discussions of uses of variance-stabilizing transformations of count data, search this site for 'square root transformation' or 'variance stabilizing transformation'. Some of the applications discussed there may be helpful for your work. To start, perhaps look here.
(c) In your link, I wonder whether there is a typo in specifying the 'arcsine' transformation for binomial data. I have always seen this explained in terms
of the arcsine of the square root of binomial proportions, not binomial counts. (Otherwise, you would be trying to take arcsines of quantities outside $(0,1).$)
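As a small numerical illustration of that point (my own check): for binomial proportions, the arcsine of the square root has variance roughly $1/(4n),$ nearly constant across the true proportion, which is what makes it variance-stabilizing:
# Illustration: for binomial proportions phat = X/n,
# var(asin(sqrt(phat))) is roughly 1/(4n), nearly constant in p
set.seed(202)    # arbitrary seed
n = 50; p = c(.2, .4, .6, .8)
sapply(p, function(pr) var(asin(sqrt(rbinom(10^5, n, pr)/n))))
1/(4*n)          # delta-method approximation, 0.005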