
"The central limit theorem states that if you have a population with mean μ and standard deviation σ and take sufficiently large random samples from the population with replacement, then the distribution of the sample means will be approximately normally distributed." (from sphweb.bumc.bu.edu)

While reviewing the underlying fundamentals, I was puzzled by an example that tries to explain how the CLT arises. More specifically, I wonder why the "large random samples" here should be counted by the "permutation with repetition" method rather than the "combination" method. In this example (from a class I'm taking; not a perfect example, since the sample size is not large), we suppose there is a finite population of 8 numbers:

54, 55, 59, 63, 64, 68, 69, 70

Then it says that there are 64 possible samples of size $n=2$. This is because we can count them as $2\binom{8}{2}+8$ or as $8^2$, and both equal $64$.
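As a quick check of that count, here is a minimal Python sketch that enumerates the samples both ways. The variable names are only illustrative and are not from the class example.

```python
from itertools import combinations, product

population = [54, 55, 59, 63, 64, 68, 69, 70]

# Ordered pairs drawn with replacement: 8 choices for each of the 2 draws.
ordered_pairs = list(product(population, repeat=2))

# Unordered pairs of two different values: 8C2 of them.
distinct_pairs = list(combinations(population, 2))

# Each distinct pair can occur in two orders, plus 8 pairs that repeat one value.
count_by_formula = len(distinct_pairs) * 2 + len(population)

print(len(ordered_pairs), count_by_formula)
```

Both numbers come out as 64, so the two ways of counting describe the same set of ordered samples.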

And all the samples look like this:

[image: table of all 64 samples of size 2]

How do we understand the “sampling with replacement” here? Obviously the order matters here, but why does it?

When order matters, 2 samples with the same elements but different orders will produce the same means. For example, (55,57) and (57,55) both produce sample means of 56. Why would we want the same mean twice?

Peiran Yu
    Your quote says "large random samples". Your example has samples of size $2$, which is not usually regarded as large. But your example does show the $8^2=64$ equally likely samples of size $2$. Each of the $8$ values is equally likely to be the first draw, and independently each of the $8$ values is equally likely to be the second draw. The independence of the draws, together with a large sample size, helps with the Central Limit Theorem. Not allowing repetitions, or allowing repetitions but making order not matter, would remove independence – Henry Feb 10 '22 at 16:41
  • Henry, you are right that the example isn't the best, since the sample size here is 2, which is unacceptably small. This is an example I found in a textbook; I guess it's only for demonstration purposes. – Peiran Yu Feb 23 '22 at 08:24
  • I'm glad you mentioned the "independence" aspect as part of the CLT requirement. I believe this is the key. Thank you! (By contrast, answers I found elsewhere told me that "we needed to exhaust all possible combinations of sampling".) – Peiran Yu Feb 23 '22 at 08:26
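
To illustrate the point about equally likely ordered samples, here is a rough Python sketch. It assumes two independent draws with replacement from the 8 values, then collapses the order to see how much probability each unordered sample carries. The names `unordered_prob` and `mean_of_means` are made up for this illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [54, 55, 59, 63, 64, 68, 69, 70]

# Two independent draws with replacement: all 64 ordered pairs are equally likely.
ordered_pairs = list(product(population, repeat=2))
p_each = Fraction(1, len(ordered_pairs))

# Collapse the order and accumulate the probability of each unordered sample.
unordered_prob = Counter()
for a, b in ordered_pairs:
    unordered_prob[tuple(sorted((a, b)))] += p_each

print(unordered_prob[(54, 55)])  # mixed pair, reachable in two orders
print(unordered_prob[(54, 54)])  # repeated value, reachable in one order

# Average of the sample means, weighting every ordered pair equally.
mean_of_means = sum(Fraction(a + b, 2) for a, b in ordered_pairs) * p_each
population_mean = Fraction(sum(population), len(population))
print(mean_of_means == population_mean)
```

A mixed pair ends up with probability 1/32 and a repeated value with 1/64, so the 36 unordered samples are not equally likely. Listing both orders is just a way of giving each sample mean its correct weight, and with those weights the average of the sample means equals the population mean.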

1 Answer


It would be nice to see the full example to answer with more precision, but I think there is too much stress on the permutation part. For example, you are correct in noting that $(55,57)$ and $(57,55)$ give us the same average, but is it true that you will always retrieve these two instances from the population during random sampling? The interesting question for me, in trying to understand the CLT, is how the number of samples impacts the statistic under consideration. For example, I might like to know how far off the mean of the first sample is from the actual mean (since in this case we know the entire population) and how many more samples are needed to approach the true mean. I'd also wonder how many instances should make up a sample...
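
One way to explore the sample-size question is a small simulation. The sketch below is only an illustration under stated assumptions: it uses the 8 values from the question, draws with replacement, and picks the seed, the trial count and the sample sizes arbitrarily. It compares the observed spread of the sample means with the standard result $\sigma/\sqrt{n}$.

```python
import random
import statistics

population = [54, 55, 59, 63, 64, 68, 69, 70]
rng = random.Random(0)  # fixed seed, chosen arbitrarily, so runs are repeatable

def sample_means(n, trials=10_000):
    """Means of `trials` samples of size n, each drawn with replacement."""
    return [statistics.fmean(rng.choices(population, k=n)) for _ in range(trials)]

sigma = statistics.pstdev(population)  # population standard deviation

results = {}
for n in (2, 10, 50):
    observed = statistics.pstdev(sample_means(n))  # spread of the simulated means
    predicted = sigma / n ** 0.5                   # sigma / sqrt(n)
    results[n] = (observed, predicted)
    print(n, round(observed, 3), round(predicted, 3))
```

The spread of the sample means should shrink roughly like $1/\sqrt{n}$ as the sample size $n$ grows, which gives a sense of how far a single sample mean tends to be from the true mean.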

wellplayed