I'm starting an experimental study with ~20 independent variables. The variables are all parametric and orthogonal, and they will be manipulated together in a pseudo-random manner (e.g. each trial will include a specific level of either all or a subset of the variables).

If I would like to estimate the impact of each of these variables on the dependent measure, how many trials and participants should I aim for?

Rman
  • As many as you can. – user2974951 Jan 24 '19 at 11:56
  • clearly. minimum? – Rman Jan 24 '19 at 12:04
  • That depends on your objective and your specifications. I think you want to perform a power analysis; in that case we will need some relevant information: your desired $\alpha$, the power level and the standard deviation of your data. – user2974951 Jan 24 '19 at 12:07
  • ideal power at 0.8 but no data and no previous studies to work with. – Rman Jan 24 '19 at 12:13
  • That is a problem. You need to either: 1) get some initial data to estimate these parameters, 2) find previous related work on this subject and use their estimates, 3) use domain knowledge to guesstimate data distribution. – user2974951 Jan 24 '19 at 12:16
  • Have a look at https://stats.stackexchange.com/questions/10079/rules-of-thumb-for-minimum-sample-size-for-multiple-regression/10105#10105 and https://stats.stackexchange.com/questions/35940/simulation-of-logistic-regression-power-analysis-designed-experiments/35994#35994 for some guidance. – user2974951 Jan 24 '19 at 12:21

1 Answer

This question needs precise specifications to get a precise answer. If no such information is available, however, we can still get a rough estimate using some conventional values. I will conduct the power analysis in R with the pwr package.

In the pwr package, the function pwr.f2.test handles general linear models. It needs:

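  1. $u$, the number of coefficients excluding the intercept (the numerator degrees of freedom)
  2. $v = n - u - 1$, the denominator degrees of freedom, from which we get $n = v + u + 1$
  3. $f^2$, the effect size (the f2 argument), for which the conventional values for small, medium and large effects are 0.02, 0.15 and 0.35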
  4. $\alpha$, significance level
  5. power
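
For orientation, the effect size here is Cohen's $f^2$, which relates to the proportion of variance explained by the model, $R^2$. The conversion below is the standard definition and is not taken from the question itself; the $R^2$ values are simply the three conventional $f^2$ values converted and rounded:

$$f^2 = \frac{R^2}{1 - R^2} \quad\Longleftrightarrow\quad R^2 = \frac{f^2}{1 + f^2}$$

$$f^2 = 0.02 \Rightarrow R^2 \approx 0.020, \qquad f^2 = 0.15 \Rightarrow R^2 \approx 0.130, \qquad f^2 = 0.35 \Rightarrow R^2 \approx 0.259$$

So choosing a "medium" effect amounts to assuming that the 20 variables together explain roughly 13 % of the variance in the dependent measure.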

So, in your case, $u=20$, $v$ is left as NULL (it is the unknown we solve for), $f^2=0.15$ (medium effect), $\alpha=0.05$ and power is 0.8.

> library(pwr)
> pwr.f2.test(u = 20, v = NULL, f2 = 0.15, sig.level = 0.05, power = 0.8)

     Multiple regression power calculation 

              u = 20
              v = 135.071
             f2 = 0.15
      sig.level = 0.05
          power = 0.8

Hence, rounding $v$ up to the next integer (136) so that the target power is actually reached, $n=v+u+1=136+20+1=157$ observations.
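
Since the effect size is a guess, it may help to see how the required sample size changes across the three conventional values. The following is a minimal base-R sketch, not the pwr implementation itself: it assumes the usual noncentral-F formulation with noncentrality $\lambda = f^2(u+v+1)$, and the helper names f2_power and required_n are made up for this illustration.

```r
# Power of the overall F test for a linear model with u predictors and
# v = n - u - 1 residual degrees of freedom, given Cohen's f2.
# Assumption: noncentrality parameter lambda = f2 * (u + v + 1).
f2_power <- function(u, v, f2, sig.level = 0.05) {
  lambda <- f2 * (u + v + 1)
  crit <- qf(sig.level, u, v, lower.tail = FALSE)  # critical F value under H0
  pf(crit, u, v, ncp = lambda, lower.tail = FALSE) # P(reject H0) under H1
}

# Solve for v numerically, round it up, and convert to a sample size n.
required_n <- function(u, f2, sig.level = 0.05, power = 0.8) {
  root <- uniroot(function(v) f2_power(u, v, f2, sig.level) - power,
                  interval = c(2, 1e4), tol = 1e-8)$root
  v <- ceiling(root)  # round up so the target power is reached
  c(v = v, n = v + u + 1)
}

# Conventional small, medium and large effect sizes.
effect_sizes <- c(small = 0.02, medium = 0.15, large = 0.35)
result <- sapply(effect_sizes, function(f2) required_n(u = 20, f2 = f2))
print(result)
```

For the medium effect this should reproduce the value of $v \approx 135.07$ shown above, rounded up to 136, so $n = 157$. The small-effect column shows how quickly the requirement grows if the true effect is weaker than assumed.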

user2974951