
My study has 6 continuous predictor variables and a dichotomous criterion variable. The IRB wants me to provide a power analysis, which I take to mean an account of how I decided on the number of participants to recruit.

How do I use a power analysis to compute the minimum number of participants needed?

All of the calculators (including G*Power 3) accept only one predictor variable, so I'm lost.

Jeromy Anglim
user26717
  • So is this for a logistic regression? – Glen_b Jun 11 '13 at 02:57
  • One way to deal w/ scenarios that are too complicated for preprogrammed approaches is to simulate. You can read about some of the basic ideas in my answer here: [simulation-of-logistic-regression-power-analysis-designed-experiments](http://stats.stackexchange.com/questions/35940//35994#35994). I don't know if this would be beyond your reach at present, you may need to work w/ a consultant. – gung - Reinstate Monica Jun 11 '13 at 03:13

1 Answer


In a standard logistic regression with $k=6$ predictors, you have 7 main tests of statistical significance: one for each of the six predictors, and one for the overall model. Each of these significance tests has its own statistical power.

By convention, researchers often aim to have at least 80% power for the most important statistical tests.

Thus, you need to decide which statistical tests are important to you. Are you only interested in the overall model prediction, or are you interested in the coefficients of the individual predictors?

In most situations, statistical power is greater for the overall model than for the individual predictors, because the overall model combines the individual effects. Furthermore, multicollinearity makes it harder to determine which predictor matters more: the standard errors of the coefficients increase, and in turn statistical power for the coefficients is reduced.
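To make the multicollinearity point concrete, here is a rough sketch that borrows the familiar linear-regression result as an approximation (logistic regression weights the observations, so the exact expression differs). The symbol $R_j^2$ is notation introduced here: the squared multiple correlation of predictor $j$ with the other five predictors.

$$\operatorname{Var}(\hat\beta_j) \;\propto\; \frac{1}{1 - R_j^2}$$

If, say, $R_j^2 = 0.5$, the variance doubles and the standard error grows by a factor of $\sqrt{2} \approx 1.41$ relative to uncorrelated predictors. The test statistic $\hat\beta_j / \mathrm{SE}(\hat\beta_j)$ shrinks by the same factor, which is why power for that coefficient drops.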

Gung has provided an excellent previous answer about power analysis for logistic regression.

If I were doing a power analysis for logistic regression, I'd run a simulation, i.e.:

  1. Write some code that generates the data given assumptions about the intercorrelations of the predictors and the combined effect of the predictors on the binary DV, where each dataset has the same specified sample size.
  2. Simulate datasets 1,000 or more times.
  3. For each simulation, test whether the effects of interest (e.g., individual coefficients or the overall model) are statistically significant.
  4. The proportion of statistically significant simulations for a given effect is your estimate of statistical power.

Repeat the simulation above for various sample sizes until you get a sample size that provides you with adequate power for your effects of interest.
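The four steps and the repetition over sample sizes can be sketched in Python with NumPy only. This is one possible illustration, not a definitive implementation, and everything specific in it is an assumption made for the example: the predictors are multivariate normal with a common correlation `rho`, the effects `betas` are on the log-odds scale, individual coefficients are tested with Wald tests and the overall model with a likelihood-ratio test, and the "effects of interest" are taken to be all individual coefficients. The function names (`fit_logistic`, `chi2_sf_even_df`, `simulate_power`, `smallest_adequate_n`) are hypothetical, and the chi-square tail formula used here only holds for an even number of predictors (such as 6).

```python
import math
import numpy as np


def fit_logistic(X, y, n_iter=25, tol=1e-8):
    """Fit a logistic regression by Newton-Raphson.

    Returns (coefficients, covariance matrix, log-likelihood).
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-np.clip(X @ beta, -30, 30)))
        info = X.T @ (X * (p * (1.0 - p))[:, None])  # Fisher information X'WX
        step = np.linalg.solve(info, X.T @ (y - p))
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    p = 1.0 / (1.0 + np.exp(-np.clip(X @ beta, -30, 30)))
    info = X.T @ (X * (p * (1.0 - p))[:, None])
    loglik = np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))
    return beta, np.linalg.inv(info), loglik


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; this closed form needs even df."""
    assert df % 2 == 0, "closed form below needs an even number of predictors"
    half = x / 2.0
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))


def simulate_power(n, betas, rho=0.3, intercept=0.0, alpha=0.05, n_sims=1000, seed=0):
    """Estimate power per coefficient (Wald) and for the overall model (LR test)."""
    rng = np.random.default_rng(seed)
    betas = np.asarray(betas, dtype=float)
    k = len(betas)
    corr = np.full((k, k), rho) + (1.0 - rho) * np.eye(k)  # assumed: equal intercorrelations
    coef_hits = np.zeros(k)
    model_hits = 0
    for _ in range(n_sims):  # Step 2: many simulated datasets
        # Step 1: generate one dataset of size n under the assumed model.
        Z = rng.multivariate_normal(np.zeros(k), corr, size=n)
        prob = 1.0 / (1.0 + np.exp(-(intercept + Z @ betas)))
        y = (rng.random(n) < prob).astype(float)
        if y.min() == y.max():
            continue  # only one outcome class: counted as non-significant
        X = np.column_stack([np.ones(n), Z])
        try:
            beta_hat, cov_hat, ll1 = fit_logistic(X, y)
        except np.linalg.LinAlgError:
            continue  # failed fit (e.g. separation): counted as non-significant
        # Step 3: two-sided Wald test per slope, LR test for the whole model.
        z = beta_hat[1:] / np.sqrt(np.diag(cov_hat)[1:])
        p_coef = np.array([math.erfc(abs(v) / math.sqrt(2.0)) for v in z])
        ybar = y.mean()
        ll0 = n * (ybar * math.log(ybar) + (1.0 - ybar) * math.log(1.0 - ybar))
        p_model = chi2_sf_even_df(max(2.0 * (ll1 - ll0), 0.0), k)
        coef_hits += p_coef < alpha
        model_hits += p_model < alpha
    # Step 4: proportion of significant results is the estimated power.
    return coef_hits / n_sims, model_hits / n_sims


def smallest_adequate_n(candidate_ns, target=0.80, **sim_kwargs):
    """Return the first candidate n at which every coefficient reaches the target power."""
    for n in candidate_ns:
        coef_power, model_power = simulate_power(n, **sim_kwargs)
        if coef_power.min() >= target:
            return n, coef_power, model_power
    return None
```

A call such as `smallest_adequate_n(range(100, 801, 50), betas=np.full(6, 0.4))` would walk through the candidate sample sizes and stop at the first one where all six coefficients reach 80% estimated power. The effect sizes and correlation in that call are placeholders: the answer depends entirely on the values you can justify for your own study, and if only the overall model matters the stopping rule should check `model_power` instead.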

Note that statistical power will vary substantially with the assumptions you make about the intercorrelations of the predictors, the strength of the relationship between the predictors and the criterion, and the degree to which the two outcome groups are of equal size.

Jeromy Anglim