Say a study aims to test whether two variables A and B are correlated in the general population. It isn't clear to me how extraneous variables like gender should be treated when the sample is selected. Each of the two methods of "controlling" for gender that I am aware of (see below) seems flawed, so is it a case of choosing the less flawed one for the study at hand?
1) You use subjects of only one gender, so that gender does not exist as a variable at all. Surely, then, any effect you find is open to the doubt that it may exist only in that gender and not in the other?
2) You use 50% males and 50% females and claim that the effect exists in the general population because it exists in your gender-matched sample. Surely, though, what you have shown is an effect that exists when averaging across genders, which may not exist when looking within each gender separately.
It seems to me that even if gender effects are not of interest, an interaction between gender and the independent variable, if one exists, still needs to be decomposed (broken up), because the effect may hold in one gender but not in the other. Method 1) tells you nothing about the other gender, while method 2) averages together what could well be two very different effects, possibly either masking two effects that exist individually or creating an artificial effect that does not exist within either gender.
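To make my worry about method 2) concrete, here is a minimal Python sketch using made-up numbers. The data, the `pearson` helper, and all variable names are invented purely for illustration and do not come from any real study. It shows one toy case where pooling masks two opposite within-gender correlations, and one where pooling produces a positive correlation that exists in neither gender.

```python
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation coefficient (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Case 1 (masking): A and B rise together in one gender and move in
# opposite directions in the other. All numbers are invented.
male_a, male_b = [1, 2, 3], [1, 2, 3]        # within-gender r = +1
female_a, female_b = [1, 2, 3], [3, 2, 1]    # within-gender r = -1

r_mask_male = pearson(male_a, male_b)
r_mask_female = pearson(female_a, female_b)
r_mask_pooled = pearson(male_a + female_a, male_b + female_b)

# Case 2 (artificial effect): A and B are negatively related within each
# gender, but one gender scores higher on both variables overall.
male_a2, male_b2 = [1, 2, 3], [3, 2, 1]      # within-gender r = -1
female_a2, female_b2 = [6, 7, 8], [8, 7, 6]  # within-gender r = -1

r_art_male = pearson(male_a2, male_b2)
r_art_female = pearson(female_a2, female_b2)
r_art_pooled = pearson(male_a2 + female_a2, male_b2 + female_b2)

print("masking:   ", r_mask_male, r_mask_female, r_mask_pooled)
print("artificial:", r_art_male, r_art_female, round(r_art_pooled, 2))
```

In the first case the pooled correlation is 0 even though each gender shows a perfect correlation; in the second the pooled correlation is about 0.81 even though it is -1 within each gender. This is exactly the situation I am unsure how to guard against when gender is not itself of interest.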
Confusing! Help please?