How to calculate the effect size for a t-test?

Question

The pwr package includes the function pwr.t.test, which relates the following four quantities:

sample size
effect size
significance level = P(Type I error)
power = 1 - P(Type II error)

Given any three, we can determine the fourth.

What is the formula powering this function? I tried looking at the code but could not make sense of it. I want to implement an effect size calculator outside of R (i.e., I want to supply the sample size, significance level, and power, to determine the effect size).

I looked at several pages, questions, and vignettes—many offered contradictory calculations or ones that did not make sense to me.

dbwilson · Accepted Answer · 2019-01-17T14:55:49.387

Have a look at this related question. In particular, look at Lehr's rule, which provides an approximation: $$n = \frac{16}{\Delta^2},$$ where $\Delta$ is the proposed effect size ($d$) and $n$ is the approximate sample size for each group. The constant 16 corresponds to a two-sided test at $\alpha = 0.05$ with 80% power. This is similar to @user2974951's answer, but I believe it is more precise.
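A minimal Python sketch of Lehr's rule and its inversion for the effect size, in case it helps when working outside R. The function names `lehr_n_per_group` and `lehr_effect_size` are made up for illustration, and the sketch assumes the conventional setting behind the constant 16 (two-sided $\alpha = 0.05$, 80% power).

```python
import math

def lehr_n_per_group(delta):
    """Approximate per-group sample size from Lehr's rule: n = 16 / delta^2."""
    return 16.0 / delta ** 2

def lehr_effect_size(n_per_group):
    """Lehr's rule solved for the effect size: delta = 4 / sqrt(n)."""
    return 4.0 / math.sqrt(n_per_group)

print(lehr_n_per_group(0.5))  # 64.0
print(lehr_effect_size(64))   # 0.5
```

Solving the rule for $\Delta$ gives the quantity asked for in the question, but only for that one combination of significance level and power.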

To get exact results, you need to work with the t-distribution directly, determining the t-value for given p-values and degrees of freedom, etc.

ADDENDUM:

The approximation above is based on the following. Recall that Cohen's $d$ can be calculated from $t$ and the group sample sizes as: $$ d = t \sqrt{\frac{n_1 + n_2}{n_1 n_2}} ~.$$ If the sample sizes are equal ($n_1 = n_2 = n$), this simplifies to: $$ d = t \sqrt{\frac{2}{n}} ~.$$ We can further manipulate this for the purpose of power analysis: $$ d^2 = \frac{2 t^2}{n}, \quad \text{therefore} \quad n = \frac{2 t^2}{d^2} ~.$$
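The two identities above can be checked numerically with a short Python sketch. The helper names and the example numbers ($t = 2.5$, 50 per group) are arbitrary choices for illustration.

```python
import math

def cohens_d_from_t(t, n1, n2):
    """Cohen's d from a two-sample t statistic and the two group sizes."""
    return t * math.sqrt((n1 + n2) / (n1 * n2))

def n_per_group_from_t(t, d):
    """Equal-group identity d = t * sqrt(2 / n), solved for n."""
    return 2 * t ** 2 / d ** 2

d = cohens_d_from_t(2.5, 50, 50)
print(d)                           # close to 0.5
print(n_per_group_from_t(2.5, d))  # recovers a value close to 50
```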

In power analysis we are interested in the assumed (true) population effect size ($\Delta$) and need a $t$-value associated with that effect size at our desired power level. We will start by determining $\Delta$ for a given sample size, alpha level, and power level: $$ \Delta = (t_{1-\alpha/2,df} + t_{power,df})\sqrt{\frac{2}{n}} ~,$$ where the $t$-values are the critical values of $t$ at our two-tailed alpha level and at our power level, given a specific degrees of freedom ($df = 2n - 2$ for two groups of size $n$). If we rearrange this equation to solve for $n$, we get: $$ n = \frac{2 (t_{1-\alpha/2,df} + t_{power,df})^2}{\Delta^2} ~.$$ However, we have a problem. We need to know $n$ to know the degrees of freedom for $t$. That is, we have $n$ on both sides of the equation. The solution is to iterate through the prior equation that solves for $\Delta$ to find the $n$ that returns the desired $\Delta$ for our alpha and power levels.
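Here is a rough, dependency-free Python sketch of these two formulas. Everything in it is an assumption made for illustration: the function names are invented, the $t$ quantile is obtained by numerically integrating the density (Simpson's rule) and bisecting, and the search for $n$ is a simple fixed-point iteration started from Lehr's rule rather than any particular library's algorithm. It assumes two equal groups, a two-sided test, power of at least 0.5, and moderate effect sizes (so that $df$ stays well above 1).

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=1000):
    """P(T <= x) for x >= 0, using Simpson's rule on the density over [0, x]."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_quantile(p, df):
    """Quantile for 0.5 <= p < 1, found by bisection on t_cdf."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def detectable_effect(n_per_group, alpha=0.05, power=0.80):
    """Delta = (t_{1-alpha/2,df} + t_{power,df}) * sqrt(2 / n), with df = 2n - 2."""
    df = 2 * n_per_group - 2
    t_sum = t_quantile(1 - alpha / 2, df) + t_quantile(power, df)
    return t_sum * math.sqrt(2 / n_per_group)

def required_n_per_group(delta, alpha=0.05, power=0.80, tol=1e-6):
    """Iterate n = 2 * (t_sum)^2 / delta^2 until n stops changing."""
    n = 16 / delta ** 2  # Lehr's rule as the starting guess
    for _ in range(50):
        df = 2 * n - 2
        t_sum = t_quantile(1 - alpha / 2, df) + t_quantile(power, df)
        new_n = 2 * t_sum ** 2 / delta ** 2
        if abs(new_n - n) < tol:
            break
        n = new_n
    return new_n

print(detectable_effect(64))      # a little under 0.5
print(required_n_per_group(0.5))  # roughly 63.8 per group
```

As far as I know, `pwr.t.test` itself solves the power equation with the noncentral $t$ distribution, so its results can differ from this sketch in the later decimals.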

Using the value 16 in the numerator of the above equation, however, produces a good approximation. This is the numerator associated with a sample size (in each group) of 52.428. If the approximation returns a sample size greater than 52.428, then it is a slight overestimate, as we see above on your example (the approximation returns 64 compared to the exact solution of 63.77). If the approximation returns a sample size less than 52.428, then it is an underestimate of the needed sample size. I haven't explored at which point the underestimation becomes severe, but I suspect, based on where $t$-values really start to grow, that it is for values less than around $n=20$.

Thanks for the addendum—quite informative! Your last paragraph prompts the question—how well does the approximation work when dealing with very large samples (n = 40,000)? — Khashir, Jan 17 '19 at 18:47
With n = 40,000, sig level = .05, and power = .8, you have enough power to detect an effect size of about .02 (very small). Working with an effect size of .02, the approximation produces a needed n of 40,000 whereas the exact method indicates you only need 39,245. Close enough for power analysis. — dbwilson, Jan 17 '19 at 20:56
Thanks for this—when testing it out, I realized my use case is slightly different (I have a sample size and probability of success that I need to account for). I'll ask a separate question later on. Cheers! — Khashir, Feb 04 '19 at 20:21

score 1 · Answer 2 · answered Jan 16 '19 at 10:59

To keep it short:

If the goal is $80 \%$ power to detect a difference of $\Delta$, with a study of size $n$, equally divided between the two groups, then the required sample size is $n=2(\sigma_1^2+\sigma_2^2)(2.8/\Delta)^2$. If $\sigma_1=\sigma_2=\sigma$, this simplifies to $(5.6\sigma/\Delta)^2$.

To have $80 \%$ power, the true value of the parameter must be $2.8$ standard errors away from the comparison point: the value $2.8$ is $1.96$ from the $95 \%$ interval, plus $0.84$ to reach the $80th$ percentile of the normal distribution.

For example: suppose previous results show a difference of $0.5$ standard deviations, that is $\Delta=0.5\sigma$. To have $80 \%$ power to detect an effect of this size, it would be sufficient to have a total sample size of $n=(5.6/0.5)^2\approx 126$, or $n/2=63$ in each group.
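A small Python sketch of this calculation, using only the standard library. The function name `total_n_normal_approx` is invented for illustration. Instead of the rounded $2.8$, it computes the multiplier from normal quantiles, so the result differs slightly from the hand calculation.

```python
from statistics import NormalDist

def total_n_normal_approx(delta, sigma=1.0, alpha=0.05, power=0.80):
    """Total n (both groups combined) from the normal approximation,
    assuming equal group sizes and a common standard deviation sigma."""
    z = NormalDist()
    # About 1.96 + 0.84 = 2.8 for alpha = 0.05 and power = 0.80.
    multiplier = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return (2 * multiplier * sigma / delta) ** 2

n_total = total_n_normal_approx(0.5)
print(n_total, n_total / 2)  # total is between 125 and 126; half of that per group
```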

With R:

> library(pwr)
> pwr.t.test(n=NULL,
           d=0.5,
           sig.level=0.05,
           power=0.8,
           type="two.sample",
           alternative="two.sided")

     Two-sample t test power calculation 

              n = 63.76561
              d = 0.5
      sig.level = 0.05
          power = 0.8
    alternative = two.sided

  NOTE: n is number in *each* group

Note that the two results are not identical, but close enough.

So, with $\Delta$ expressed in standard-deviation units ($\sigma=1$) and $n$ the total sample size, this formula can be rearranged to

$$\Delta=\dfrac{5.6}{\sqrt{n}}$$
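For an implementation outside R, the rearranged formula can be sketched in a few lines of Python. The function name `detectable_effect_normal_approx` is hypothetical. It assumes $n$ is the total sample size split equally between two groups, and it replaces the rounded $5.6$ with twice the sum of the two normal quantiles so that other significance levels and power values can be plugged in.

```python
import math
from statistics import NormalDist

def detectable_effect_normal_approx(n_total, alpha=0.05, power=0.80):
    """Detectable effect size (in SD units) for a total sample size n_total,
    using the normal approximation Delta = 2 * (z_{1-alpha/2} + z_power) / sqrt(n)."""
    z = NormalDist()
    multiplier = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)  # about 2.8
    return 2 * multiplier / math.sqrt(n_total)

print(detectable_effect_normal_approx(126))    # close to 0.5
print(detectable_effect_normal_approx(80000))  # about 0.02 (40,000 per group)
```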
