Adjusting any power analysis with FPC?

Question

Is it correct (perhaps at least as a good approximation) to determine sample size via a power analysis that assumes an infinite population and then to adjust the required sample size by a finite population correction?

For example, I am using the formula from Fleiss:

[Image not recovered: Fleiss's sample-size formula for comparing two independent proportions with unequal sample sizes]

I use it to calculate the sample size required to test the difference between two independent proportions, with unequal sample sizes.

Suppose this formula gives the two sample sizes as:

n1= 300

n2= 600

(as r in the formula, the ratio n2/n1, was set to 2) can I then apply the FPC to the total sample size to correct for the finite population?

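[Image not recovered: finite population correction formula. Judging from the calculation below, it has the form $n_{adj} = \frac{nN}{n + (N - 1)}$, with $n$ the uncorrected sample size and $N$ the population size.]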

For example, say the population size is really 1000. Then, can I apply the correction to the total sample size (300+600=900) and recalculate the sample sizes from the two populations as:

new_total_sample = $\frac{900 \times 1000}{900 + (1000 - 1)} \approx 474$, and then allocate between the two populations as new_n1 = 158 and new_n2 = 316?
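The arithmetic above can be reproduced with a short Python sketch. The function names and the rounding rule (round to the nearest whole unit, remainder goes to the second group) are assumptions made for illustration. The sketch only reproduces the calculation; it does not settle whether the adjustment is appropriate for a hypothesis test.

```python
def fpc_adjusted_total(n, N):
    """Adjusted total sample size: n * N / (n + (N - 1))."""
    return n * N / (n + (N - 1))


def allocate(total, r):
    """Split total into (n1, n2) with n2 / n1 close to r.

    Assumed rule: round n1 to the nearest integer, give the rest to n2.
    """
    n1 = round(total / (1 + r))
    return n1, total - n1


adjusted = round(fpc_adjusted_total(900, 1000))  # 900000 / 1899 is about 473.9
n1, n2 = allocate(adjusted, 2)
print(adjusted, n1, n2)  # 474 158 316
```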

Thanks!

Added after Dr. Lumley's response:

I. I am a little confused about the difference between a hypothesis test and a confidence interval for a finite population difference. Are you saying it is legitimate to estimate the confidence interval of say, the difference between the proportion of men and women answering 'Yes' but not a p-value to determine if it is zero (i.e. to reject the null)?

In the language of your package, I am thinking of

library(survey)  # provides svydesign() and svyglm()
design <- svydesign(id = ~1, fpc = ~N, data = experimentData)
svyglm(respondedYES ~ as.factor(isMale), design = design, family = binomial)

where you would construct a CI around the coefficient for isMale.

II. Also, I think I understand the issue in your example, where the proportions for men and women could never be equal because the two counts have no common factor. But what about hypothesis testing with a finite population in an experimental design (where the researcher controls the treatment assignment), where this issue is not present? That is the setting in which I am trying to learn how (or whether) the fpc can be used.

score 8 · Answer 1 · answered Feb 01 '14 at 22:11

The typical rule is that fpc isn't used in testing, because the finite-population null hypothesis is typically not of interest. For example, if you had 1048576 men and 177147 women in a population it is not possible that the proportion answering "Yes" to a question could be the same for men and women (unless it's 0 or 1), since these numbers have no common factor. And if the validity of your null hypothesis depends on the common factors of large unknown numbers, it is very unlikely to be a sensible null hypothesis.
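The common-factor argument can be checked directly. The sketch below is illustrative only (the function name is hypothetical): a proportion that is attainable in both groups must be a multiple of 1/g, where g is the greatest common divisor of the two group sizes, so with g = 1 only 0 and 1 remain.

```python
from fractions import Fraction
from math import gcd


def shared_proportions(n_men, n_women):
    """Proportions attainable in both groups: k/g for k = 0..g, g = gcd."""
    g = gcd(n_men, n_women)
    return [Fraction(k, g) for k in range(g + 1)]


print(gcd(1048576, 177147))                 # 1  (2**20 and 3**11)
print(shared_proportions(1048576, 177147))  # [Fraction(0, 1), Fraction(1, 1)]
print(len(shared_proportions(10, 15)))      # 6  (0, 1/5, ..., 1)
```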

However, it is entirely reasonable to be interested in the confidence interval for a finite population difference, and to be interested in whether this interval includes numbers that are close to zero or not. This can be genuinely a finite population question, in the sense that there would be no uncertainty if exact census data were available. In that situation, the calculations you would do would be formally identical to those you would ordinarily do to calculate the power for testing equality.

I would argue that there are at least two reasons software supports testing using fpc. First, it is important for advertising (though less so for practical use) to support standard error calculations with fpc, and it's then easier not to specially disable testing. Second, people often use one type of survey analysis as an approximation to another, e.g., single-stage with-replacement analysis of multistage without-replacement surveys; single-phase approximations to two-phase designs; post-stratified designs as approximations to multi-frame sampling. As an implementor, you don't want to prohibit unforeseen uses. I would be annoyed by software that did more than give a warning.

In situations where fpc is a practical issue rather than just a philosophical one it is almost always straightforward to decide whether you want a population or a superpopulation analysis, just by asking what you would do if you had data on the whole population. In a lot of cases it doesn't matter, because the sampling fraction is low (except possibly in a few strata where it is 100%).
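To illustrate how little the fpc matters at a low sampling fraction, here is a minimal Python sketch. It assumes simple random sampling, the textbook variance p(1 - p)/n for a proportion, and the fpc in the form (1 - n/N); the function name and the numbers are made up for illustration.

```python
from math import sqrt


def se_proportion(p, n, N=None):
    """Standard error of a sample proportion under simple random sampling.

    If the population size N is given, the variance is multiplied by the
    finite population correction (1 - n/N).
    """
    var = p * (1 - p) / n
    if N is not None:
        var *= 1 - n / N
    return sqrt(var)


N = 10000
for n in (100, 1000, 5000, 10000):
    ratio = se_proportion(0.5, n, N) / se_proportion(0.5, n)
    print(f"sampling fraction {n / N:.2f}: SE ratio {ratio:.3f}")
# sampling fraction 0.01: SE ratio 0.995
# sampling fraction 0.10: SE ratio 0.949
# sampling fraction 0.50: SE ratio 0.707
# sampling fraction 1.00: SE ratio 0.000
```

At a 1% sampling fraction the two standard errors differ by about half a percent, while at 100% (a census) the finite-population standard error is zero.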

+1, welcome to the site, @Thomas Lumley. We hope you'll register your account and come back in the future. — gung - Reinstate Monica, Feb 01 '14 at 23:08
Thanks for this answer, Dr. Lumley! If you don't mind, I have added a follow-up to my original question above. — B_Miner, Feb 02 '14 at 03:26

Steve Samuels · Accepted Answer · 2014-02-06T23:32:56.820

Update: 2014-02-06: changed text to be more emphatic that fpc should not be used in a causal analysis

Update: 2014-02-04: impact of the randomized experimental design

This question has raised some fundamental issues.

You stated in your update that a researcher can control the make-up of the experimental groups. Not so. Even if one randomized an entire population, there would be imbalance, perhaps trivial, in every variable. Even with some kind of balancing algorithm, which would destroy the randomization, one can never arrange for the groups to have identical means of the outcome variable, which is as yet unmeasured.

You also asked Tom Lumley:

Are you saying it is legitimate to estimate the confidence interval of say, the difference between the proportion of men and women answering 'Yes' but not a p-value to determine if it is zero (i.e. to reject the null)?

I think that's what Tom meant, and I agree with its application to descriptive statistics; ~~I'm not sure that it applies~~ It does not apply to causal analyses, including those generated by an experiment. Your particular example is a borderline case, as you intend the results to apply to a single population at a particular time. If someone asked you to project your findings to another setting or to another time period, the confidence interval calculation ~~probably~~ should not include the fpc.

Some additional insight can be gained by considering the experimental design as part of the sample design. If the initial random sample is of size $n$, randomization produces two random sub-samples of size $n_1 = n/2$ and $n_2 = n/2$. (For the theory that follows, $n_1$ and $n_2$ need not be equal.) Let $\overline{y}_1$ and $\overline{y}_2$ be the means of the sub-samples; proportions are special cases. In this scenario, which conforms to the absence of a treatment effect, it can be shown (Cochran, 1977, problem 2.16, p. 48) that:

\begin{equation} Var(\overline{y}_1 -\overline{y}_2) = S^2\left(\frac{1}{n_1} +\frac{1}{n_2}\right) \end{equation}

where $S^2$ is the population variance and variation is over repetitions of the sampling and randomization. Notice: no fpc.
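The result can be checked by brute force on a small population. The Python sketch below is only an illustration (the population values and function names are made up): it enumerates every way of drawing two disjoint groups of sizes $n_1$ and $n_2$ without replacement, computes the exact variance of $\overline{y}_1 - \overline{y}_2$, and compares it with $S^2(1/n_1 + 1/n_2)$ using exact fractions.

```python
from fractions import Fraction
from itertools import combinations


def exact_var_of_difference(population, n1, n2):
    """Exact variance of ybar1 - ybar2 over all equally likely disjoint groups."""
    units = range(len(population))
    diffs = []
    for g1 in combinations(units, n1):
        rest = [u for u in units if u not in g1]
        for g2 in combinations(rest, n2):
            m1 = Fraction(sum(population[u] for u in g1), n1)
            m2 = Fraction(sum(population[u] for u in g2), n2)
            diffs.append(m1 - m2)
    mean = sum(diffs) / len(diffs)
    return sum((d - mean) ** 2 for d in diffs) / len(diffs)


def no_fpc_formula(population, n1, n2):
    """S^2 * (1/n1 + 1/n2), with S^2 computed using the N - 1 divisor."""
    N = len(population)
    mean = Fraction(sum(population), N)
    s2 = sum((y - mean) ** 2 for y in population) / (N - 1)
    return s2 * (Fraction(1, n1) + Fraction(1, n2))


pop = [0, 1, 1, 0, 1, 0, 0, 1]  # hypothetical binary outcomes, N = 8
exact = exact_var_of_difference(pop, 2, 3)
formula = no_fpc_formula(pop, 2, 3)
print(exact, formula, exact == formula)  # 5/21 5/21 True
```

The agreement is exact even though 5 of the 8 units are sampled: the fpc terms in the two variances are cancelled by the negative covariance between the two sub-sample means.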

Update: one of the few established uses of hypothesis tests + FPCs for finite populations: lot quality assurance sampling (LQAS)

I agree with Tom's answer. Hypothesis testing rarely has a place in finite population questions, but confidence intervals certainly do. One good use of hypothesis tests per se in finite populations is lot quality assurance sampling (LQAS), which tests whether the rate of some event (e.g. vaccination) in a geographic area is too high or too low. Note that, unlike the question at hand, there is no hypothesis of zero difference. The null hypothesis is that the rate is $< K$, and the alternative is that it is $\geq K$. See, at Google Scholar:

Robertson, Susan E, Martha Anker, Alain J Roisin, Nejma Macklai, Kristina Engstrom, and F Marc LaForce. 1997. The Lot quality technique: a global review of applications in the assessment of health services and disease surveillance. Relation 50, no. 3/4: 199-209.

Lemeshow, Stanley, and Scott Taber. 1991. Lot quality assurance sampling: single-and double-sampling plans. World Health Stat Q 44, no. 3: 115-132.
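As a rough illustration of why the finite population enters naturally in an LQAS-type test, here is a minimal Python sketch. It is not taken from the references above; it assumes a lot of known size sampled without replacement, so the count of events in the sample is hypergeometric (the distribution whose variance carries the fpc). The function names, lot size, and decision threshold are hypothetical.

```python
from math import comb


def hypergeom_pmf(k, lot_size, events_in_lot, sample_size):
    """P(exactly k events in a without-replacement sample from the lot)."""
    return (comb(events_in_lot, k)
            * comb(lot_size - events_in_lot, sample_size - k)
            / comb(lot_size, sample_size))


def prob_at_least(d, lot_size, events_in_lot, sample_size):
    """P(at least d events in the sample), e.g. the chance of passing a lot."""
    upper = min(sample_size, events_in_lot)
    return sum(hypergeom_pmf(k, lot_size, events_in_lot, sample_size)
               for k in range(d, upper + 1))


# Hypothetical lot of 500 children, sample of 19, pass if at least 13 vaccinated.
for vaccinated in (250, 400):
    print(vaccinated, round(prob_at_least(13, 500, vaccinated, 19), 3))
```

Evaluating the passing probability at the boundary rate K gives the type I error of the one-sided rule, and at higher rates it gives the power.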

Original Answer

Using the fpc to reduce sample size makes no sense unless you intend to use it in the hypothesis-testing statistic. But that would be an error: the fpc should not be used when testing hypotheses [added about "no difference"].

The reasoning is interesting (Cochran, 1977, p. 39): It is seldom of scientific interest to ask if a null hypothesis (e.g. that two proportions are equal) is exactly true in a finite population. Except by a very rare chance, the null hypothesis will never be true, as one would discover by enumerating the entire population. Therefore hypothesis tests on samples from finite populations are done from a "super-population" viewpoint. See also Deming (1966), pp. 247-261, "Distinction between enumerative and analytic studies"; Korn and Graubard (1999), p. 227.

References

Cochran, W. G. (1977). Sampling techniques (3rd ed.). New York: Wiley.

Deming, W. E. (1966). Some theory of sampling. New York: Dover Publications.

Korn, E. L., & Graubard, B. I. (1999). Analysis of health surveys (Wiley series in probability and statistics). New York: Wiley.

Steve, but what about the case where say a business has 2,500 customers of a certain type and that stays pretty much static over time (the near term at least). You want to conduct an experiment to determine what the best method is to get them to respond (in some way). If a traditional power analysis suggests that you need 4,000 subjects, what do you do? It seems there must be a way to use the FPC? — B_Miner, Jan 31 '14 at 23:06
And would not that rule of not doing any hypothesis testing with fpc be counter to the practice of survey analysis where it seems a fpc is routinely used - for example surveyreg in SAS or the svyglm in R? — B_Miner, Jan 31 '14 at 23:29
This is likely the closest example: a logistic regression is used to draw inference about a finite population, where each sampled individual is shown 1 of 3 treatments and asked to give feedback. This is a much more complicated example because there are strata and the response is ordinal etc., but the spirit seems exactly how I will analyze the data: using a hypothesis test with a finite population. http://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_surveylogistic_sect012.htm — B_Miner, Feb 01 '14 at 00:19
In the SAS documentation example, I consider the use of the fpc for the hypothesis tests to be questionable. But the ORs, although model-based, can be considered descriptive, and so I think the CIs are okay. In this case, the fpc can have made very little difference so, for consistency, I would have done the analysis without them. — Steve Samuels, Feb 04 '14 at 18:47
Really appreciate your follow-up, Steve! I am understanding this better. From this and others, it does appear that the SAS example is questionable. What also helped me reconcile my specific situation is thinking about whether I really have a finite population. The defining thing is whether I would expect that, on repeated experiments, the same people would respond the same way over time. That answer for me is no. Thus, it seems I can't really enumerate the "population", so I am dealing with a superpopulation and the fpc is not valid. — B_Miner, Feb 05 '14 at 15:26
You're very welcome, Brian. Your continuing questions induced me to rethink this entire issue. — Steve Samuels, Feb 05 '14 at 20:48
