
Following on from the recent questions we had here.

I was hoping to find out whether anyone has come across, or can share, R code for performing a custom simulation-based power analysis for a linear model.

Later I would obviously like to extend it to more complex models, but lm seems the right place to start.

Tal Galili

3 Answers


I'm not sure you need simulation for a simple regression model. For example, see the paper "Portable Power" by Robert E. Wheeler (Technometrics, May 1974, Vol. 16, No. 2). For more complex models, specifically mixed effects, the pamm package in R performs power analyses through simulation. Also see Todd Jobe's post, which has R code for simulation.
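
As a concrete illustration of the analytic route (an addition here, not from the Wheeler paper; it assumes the pwr package and Cohen's f2 effect-size convention), the sample size needed to detect a single predictor in a linear model can be computed without any simulation:

library(pwr)

# Cohen's f2 for a single predictor: f2 = R^2 / (1 - R^2)
# (assumed value: the predictor explains 10% of the variance)
f2 <- 0.10 / (1 - 0.10)

# u = numerator df (1 predictor); solve for v = denominator df at 80% power
res <- pwr.f2.test(u = 1, f2 = f2, sig.level = 0.05, power = 0.80)
ceiling(res$v) + 1 + 1  # required n = v + u + 1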

ars

Here are a few sources of simulation code in R. I'm not sure if any specifically address linear models, but perhaps they provide enough of an example to get the gist:

There are a couple more examples of simulation at the following sites:
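
In case those links are not enough, here is a minimal, self-contained sketch of the basic recipe (the parameter values below are made up for illustration): simulate responses from the assumed model, refit lm() each time, and take the rejection rate of the slope test as the estimated power.

set.seed(1)
n     <- 50    # candidate sample size
beta0 <- 1     # assumed intercept
beta1 <- 0.5   # slope we want to be able to detect
sigma <- 4     # assumed residual standard deviation
nsim  <- 1000  # number of simulated data sets

x <- runif(n, 0, 10)
pvals <- replicate(nsim, {
  y <- beta0 + beta1 * x + rnorm(n, sd = sigma)
  coef(summary(lm(y ~ x)))["x", "Pr(>|t|)"]
})
mean(pvals < 0.05)  # estimated power at this sample size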

Jeromy Anglim

Adapted from Bolker (2008), Ecological Models and Data in R. You need to declare the strength of the trend (i.e. the slope) you wish to detect. Intuitively, a strong trend with low variability will require a small sample size, while a weak trend with large variability will require a large sample size.

a = 2  #intercept
b = 1  #slope (the trend strength we want to detect)
sd = 20  #estimated variability, defined by the residual standard deviation
nsim = 400  #400 simulations
pval = numeric(nsim)  #placeholder for the second for loop output
Nvec = seq(25, 100, by = 1)  #vector for the range of sample sizes to be tested
power.N = numeric(length(Nvec))   #create placeholder for first for loop output
for (j in 1:length(Nvec)) {
  N = Nvec[j]  
  x = seq(1, 20, length = Nvec[j])  #length of x must match the current sample size Nvec[j]
  for (i in 1:nsim) {   #for this value of N, create random error 400 times
    y_det = a + b * x
    y = rnorm(N, mean = y_det, sd = sd)
    m = lm(y ~ x)
    pval[i] = coef(summary(m))["x", "Pr(>|t|)"]  #all the p values for 400 sims
  }  #end of the nsim loop for this value of N
  power.N[j] = sum(pval < 0.05)/nsim  #proportion of significant slopes, i.e. the power
}
power.N
plot(Nvec, power.N)  #need about 90 - 100 samples for 80% power
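
As a convenience (a small addition to the code above), the smallest simulated sample size reaching 80% power can be read off power.N directly rather than from the plot:

Nvec[which(power.N >= 0.80)[1]]  #smallest N with estimated power >= 0.80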

You can also simulate to find the smallest trend (slope) you could detect for a given sample size, as shown in the book:

# reuses a, sd, nsim, pval, and the final N and x (so N = 100) from the block above
bvec = seq(-2, 2, by = 0.1)  #range of slopes to test
power.b = numeric(length(bvec))
for (j in 1:length(bvec)) {
  b = bvec[j]
  for (i in 1:nsim) {
    y_det = a + b * x
    y = rnorm(N, mean = y_det, sd = sd)
    m = lm(y ~ x)
    pval[i] = coef(summary(m))["x", "Pr(>|t|)"]
  }
  power.b[j] = sum(pval < 0.05)/nsim
}
power.b
plot(bvec, power.b)
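
Likewise, a small addition (not in the original code) to read off the smallest slope magnitude detectable with 80% power at this sample size:

min(abs(bvec[power.b >= 0.80]))  #smallest |slope| with estimated power >= 0.80
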
Tom