
In my oral exam one of my teachers asked whether the population exists in an experimental study.

I answered: "Yes, the population exists in an experimental study, and I will get it from the subject I am interested in."

But he rejected the answer and told me that the population does not exist in an experimental study but exists in an observational study.

I do not understand this. If the population does not exist in an experimental study, then how can I get my sample?

Scortchi - Reinstate Monica
time
  • Without undue punning, much depends on what is meant by "exist" and indeed of "population". It is too late, but I would have asked your teacher what meaning is to be attached to inference with experimental data if no population exists. Even in simple cases such as tossing coins or throwing dice, the population is an abstraction rather than something that can be identified concretely. – Nick Cox Jan 14 '14 at 15:04
  • In support of what @Nick Cox has said, I wish to emphasize that a "population" is a [*modeling construct* that enables us to represent phenomena with random variables](http://stats.stackexchange.com/a/54894/919). Sometimes there is nearly a one-to-one correspondence between this construct and some physical population, but this is impossible in settings where the data are produced by a *process*. Although the distinction between a physical population and physical process does not quite capture the difference between observational and experimental studies, it may be what the teacher was thinking. – whuber Jan 14 '14 at 16:03
  • Furthermore experiments are sometimes analyzed without even the most hypothetical reference to sampling from a population: inference can be based on the random assignment of experimental treatments. – Scortchi - Reinstate Monica Jan 14 '14 at 16:10
  • @NickCox Did you mean that it depends on whether the experimental data exist? For example: (1) in tossing a coin several times, the population does not exist; (2) if I suppose there are 50 experimental units and I divide them into two groups, a treatment group and a control group, then doesn't the population exist here? – time Jan 15 '14 at 00:13
  • I think you're getting consistent answers here. The concrete issue is exactly what your teacher meant and we can't expand on quite what that was. – Nick Cox Jan 15 '14 at 03:07
  • I would guess that the teacher would accept different answers, as long as the answer was justified by defining your concepts appropriately given the context. According to my intuition, the *sample* is the set of units that we have for analysis and the *population* is the theoretical or actual set of units we care about making an inference for (or alternatively, the set of units from which you draw your sample -- i.e., the sampling frame). If the case is like @Scortchi described, with no reference to a sampling frame, then I would consider the sample to be equivalent to the population. – Dr. Beeblebrox Jan 15 '14 at 08:46

2 Answers


I found the best answer (IMO) in a dated but wonderful book: Arthur S. Goldberger, A Course in Econometrics, 1991, Chapter 16.

In experimental studies the $\mathbf{x}_i$ define $n$ "subpopulations", or strata. A random drawing is made for each stratum, that is, $y_1$ is drawn from stratum 1, $y_2$ is drawn from stratum 2, and so on. In this scheme, the sampled $y$'s are not identically distributed; they are drawn from different strata. The researcher controls the $\mathbf{x}$ values and imposes them on the subjects; he defines (we could say: he creates) the relevant "subpopulations", or strata. Furthermore, the list of $n$ selected $\mathbf{x}$ vectors and their values is maintained in repeated sampling, so:

  • the expectations of the successive $y$'s will depend only on $i$, and one can then write $E(y_i)$ instead of $E(y_i\mid\mathbf{x}_i)$ (the $\mathbf{X}$ matrix is typically nonstochastic);
  • it does not make sense to use the sample to estimate the population means and variances of the $x$'s and $y$; the sample on $\mathbf{x}$ is not randomly drawn from the population joint distribution of $\mathbf{x}$ (it is controlled by the researcher), and consequently the sample on $y$ is not randomly drawn from the population marginal distribution of $y$.
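
The two points above can be illustrated with a small simulation. This is only a sketch under assumptions that are not in Goldberger's text: the linear form of `conditional_mean`, the normal distribution of $x$ in the population (mean 10, standard deviation 2), the unit-variance noise, and the two-level design `[0.0, 5.0]` are all hypothetical choices made for illustration.

```python
import random
import statistics

def conditional_mean(x):
    # Hypothetical E(y | x); the linear form is an assumption for illustration.
    return 2.0 + 0.5 * x

def draw_y(x, rng):
    # One draw of y from the stratum defined by x (unit-variance noise assumed).
    return conditional_mean(x) + rng.gauss(0.0, 1.0)

def observational_sample(n, rng):
    # x is drawn from its population distribution, then y is drawn given x.
    xs = [rng.gauss(10.0, 2.0) for _ in range(n)]
    return xs, [draw_y(x, rng) for x in xs]

def experimental_sample(design, rng):
    # x values are imposed by the researcher; one y is drawn per stratum.
    xs = list(design)
    return xs, [draw_y(x, rng) for x in xs]

rng = random.Random(0)
design = [0.0, 5.0] * 500  # researcher-chosen x values, n = 1000
obs_x, obs_y = observational_sample(1000, rng)
exp_x, exp_y = experimental_sample(design, rng)

# Under random sampling the sample means track the population means
# (10 for x and 2 + 0.5 * 10 = 7 for y under the assumptions above).
print("observational: mean x =", round(statistics.mean(obs_x), 2),
      " mean y =", round(statistics.mean(obs_y), 2))

# Under the fixed design the mean of x is 2.5 by construction, and the mean
# of y reflects the design (about 3.25), not the population marginal mean.
print("experimental:  mean x =", statistics.mean(exp_x),
      " mean y =", round(statistics.mean(exp_y), 2))

# The stratum means E(y_i) remain estimable under the fixed design.
for x0 in (0.0, 5.0):
    stratum = [y for x, y in zip(exp_x, exp_y) if x == x0]
    print("stratum x =", x0, " mean y =", round(statistics.mean(stratum), 2))
```

Rerunning `experimental_sample` with the same `design` mimics repeated sampling with the list of $\mathbf{x}$ values held fixed: only the $y$'s change from run to run.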
Scortchi - Reinstate Monica
Sergio

Can it be explained as follows, so that I can say the population does not exist?

Suppose we are interested in whether a new drug relieves headaches. A sample of 50 people who have a headache is taken, and the new drug and a placebo are randomly assigned to the units. Here we did not take the sample from the population of people who have a headache and take the drug, since we could not identify that population physically; rather, we randomly gave the new drug to 25 of the people who have a headache. After the test procedure based on these 25 people, we talk about the whole population: if the population had a headache and were given the drug, then the result would be like that at the $\alpha=0.5$ level of significance.
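
The random assignment in this example can be sketched in code, together with one way of testing the drug that relies only on that assignment (a randomization test, as mentioned in the comments on the question). This is a hedged illustration, not necessarily the test procedure meant above: the function names are hypothetical, and the outcomes (18 of 25 relieved on the drug, 9 of 25 on placebo) are made up.

```python
import random

def assign_treatment(n_units, n_treated, rng):
    # Randomly choose which units receive the drug; the rest receive placebo.
    treated = set(rng.sample(range(n_units), n_treated))
    return [i in treated for i in range(n_units)]

def difference_in_relief(relief, treated):
    # Proportion relieved under the drug minus proportion relieved under placebo.
    drug = [r for r, t in zip(relief, treated) if t]
    placebo = [r for r, t in zip(relief, treated) if not t]
    return sum(drug) / len(drug) - sum(placebo) / len(placebo)

def randomization_p_value(relief, treated, n_draws, rng):
    # One-sided p-value: how often a fresh random assignment of the same
    # units gives a difference at least as large as the observed one.
    observed = difference_in_relief(relief, treated)
    n_treated = sum(treated)
    at_least = 0
    for _ in range(n_draws):
        reassigned = assign_treatment(len(relief), n_treated, rng)
        if difference_in_relief(relief, reassigned) >= observed:
            at_least += 1
    return observed, (at_least + 1) / (n_draws + 1)

rng = random.Random(1)
treated = assign_treatment(50, 25, rng)

# Made-up outcomes (1 = headache relieved): 18 of the 25 treated units
# and 9 of the 25 placebo units, purely for illustration.
remaining = {True: 18, False: 9}
relief = []
for t in treated:
    relief.append(1 if remaining[t] > 0 else 0)
    remaining[t] -= 1

observed, p_value = randomization_p_value(relief, treated, 5000, rng)
print("observed difference in relief rates:", round(observed, 2))
print("one-sided randomization p-value:", round(p_value, 4))
```

The reference distribution here comes from re-running the assignment on the same 50 units, so the calculation itself never samples from a larger population of headache sufferers.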

If the population did exist here, then we would first identify the population of people who have a headache and take the drug; then we would take a random sample of $25$ and draw the inference based on that sample.

The only difference I see here is this:

$\bullet$ In a sampling design, we first identify the population and then randomly select our sample. We do not apply any treatment, and this is an observational study. So the population exists here.

But

$\bullet$ In an experimental design, we first draw our sample and reason that if the population fulfilled the same conditions as the sample, then it would also produce the result at the $\alpha$ level of significance. So the population does not exist here.

time