I recently came across a statement in classwork that confused me:
the sample distribution is the distribution of the sample
I wasn't sure about the validity of this statement. Specifically, shouldn't it have the word "statistic" at the end?
The distribution of the sample statistic (i.e., if you add "statistic" to the end) is the sampling distribution. That is a very different thing from the sample distribution.
The quoted text is correct: the distribution of the sample is the sample distribution. The distribution that a sample statistic would have, if you repeated your study identically over and over indefinitely, is the sampling distribution. It may help you to look at the figure at the top of my answer here: Strategies for teaching the sampling distribution.
Suppose independent observations are drawn from a normal distribution with unknown mean $\mu$ and known standard deviation $\sigma = 1$. Then the distribution of a sample of size $n$ is multivariate normal; i.e., $$\boldsymbol X = (X_1, \ldots, X_n) \sim \operatorname{Normal}(\boldsymbol \mu, \boldsymbol \Sigma = I_n),$$ where $\boldsymbol \mu = (\mu, \ldots, \mu)$ and $I_n$ is the $n \times n$ identity matrix. This is what we would call the sample distribution. Now, if I use the sample to estimate the parameter $\mu$, one natural estimator is the sample mean $$\bar X = (X_1 + \ldots + X_n)/n,$$ and the distribution of the sample mean, that is, the sampling distribution of the sample mean, is simply univariate normal with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$; i.e., $$\bar X \sim \operatorname{Normal}(\mu, \sigma/\sqrt{n}),$$ where the second parameter here denotes the standard deviation rather than the variance. But since this is not the only possible statistic that estimates $\mu$, let alone the only statistic of interest in general, there exist other sampling distributions; how each of these is distributed depends on the statistic that is derived from the sample.
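If it helps to see the distinction numerically, here is a minimal simulation sketch of the setup above. The values `mu = 2.0`, `n = 25`, the number of repetitions, and the function name `simulate` are my own assumptions for illustration; only $\sigma = 1$ comes from the example. Each row is one sample (a draw from the sample distribution), while the collection of row means approximates the sampling distribution of the sample mean. The sample median is included as a second statistic to show that a different statistic has a different sampling distribution.

```python
import numpy as np

def simulate(mu=2.0, sigma=1.0, n=25, reps=20000, seed=0):
    """Draw `reps` independent samples of size `n` from Normal(mu, sigma)."""
    rng = np.random.default_rng(seed)
    # Each row is one sample of size n: a single draw from the sample distribution.
    samples = rng.normal(loc=mu, scale=sigma, size=(reps, n))
    # One statistic per sample; across rows these approximate sampling distributions.
    means = samples.mean(axis=1)
    medians = np.median(samples, axis=1)
    return samples, means, medians

samples, means, medians = simulate()

print("sd within one sample:     ", samples[0].std(ddof=1))
print("sd of the sample means:   ", means.std(ddof=1))
print("theoretical sigma/sqrt(n):", 1.0 / np.sqrt(25))
print("sd of the sample medians: ", medians.std(ddof=1))
```

The spread within a single sample should sit near $\sigma = 1$, while the spread of the sample means should sit near $\sigma/\sqrt{n} = 0.2$. The medians should come out somewhat more variable than the means, which is one concrete way to see that each statistic has its own sampling distribution.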