Most introductory stats textbooks treat the sampling distribution of the mean as normal when sampling is done without replacement and n/N > 0.1. They simply use the finite population correction (FPC) to correct the variance of the sampling distribution, which is perfectly clear. However, no justification is provided for why normality is assumed. The CLT holds for iid variables, which is not the case when sampling without replacement. Could you please provide a formal justification? Thanks a lot!
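
To make the variance part concrete, here is a minimal simulation sketch. It only checks the FPC variance formula $\frac{\sigma^2}{n}\cdot\frac{N-n}{N-1}$; it does not justify normality. The population (the integers 1 to 100), the sample size, and the function names are assumptions chosen for illustration, not taken from the textbook.

```python
import random
import statistics


def simulate_sample_means(population, n, reps, seed=0):
    """Draw `reps` samples of size n WITHOUT replacement; return their means."""
    rng = random.Random(seed)
    # random.Random.sample draws without replacement
    return [statistics.fmean(rng.sample(population, n)) for _ in range(reps)]


def fpc_variance(population, n):
    """Variance of the sample mean with the finite population correction."""
    N = len(population)
    sigma2 = statistics.pvariance(population)  # population variance (divisor N)
    return sigma2 / n * (N - n) / (N - 1)


# Hypothetical population with N = 100 and sampling fraction n/N = 0.3
population = [float(i) for i in range(1, 101)]
n = 30

means = simulate_sample_means(population, n, reps=20000)
simulated_var = statistics.pvariance(means)
theoretical_var = fpc_variance(population, n)

print("simulated variance of the mean:", simulated_var)
print("FPC variance of the mean:      ", theoretical_var)
```

The two printed numbers should be close to each other. A histogram of `means` would be the natural next step for looking at the shape of the distribution, which is what the question is actually about.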

Examples taken from Weiers, Intro to Business Stats. [image]

Sextus Empiricus
Manos
  • @Xiaomi, do they simply apply CLT? I can imagine that a proof may be possible independent from CLT that shows that a normal distribution approximates the situation of sampling without replacement. – Sextus Empiricus Oct 12 '18 at 10:11
  • Thanks for the answer Martijn. The question you cite explains the FPC role. The normality assumption is the main focus of my question. – Manos Oct 12 '18 at 10:38
  • Your question is good +1 but I have voted to close this question, even though I am not sure whether the duplicate (speaking about binomials vs hypergeometric distributions, which I believe are slightly different, binary, cases) has a satisfactory answer. More thorough (and historic) derivations are given by Isserlis (https://www.jstor.org/stable/2340569) and Edgeworth (https://www.jstor.org/stable/2340659) both from 1918. – Sextus Empiricus Oct 12 '18 at 10:39
  • The second point of the linked question *"How was the formula derived?"* relates to the normality assumption (although the answer is not given). – Sextus Empiricus Oct 12 '18 at 10:42
  • Where did you get $n/N > 0.1$? I think it should be $n/N < 0.1$. Then the sampling fraction is low ==> the correlation is low enough to be ignored ==> the draws are approximately independent. – user158565 Oct 13 '18 at 01:36
  • @a_statistician If n/N is small, then you can use the regular expression (without the correction factor). For large n/N the correlations play a stronger role and you need to use the correction factor. – Sextus Empiricus Oct 13 '18 at 09:22
  • Yes, that is what I mean. If $n = N-1$, the sample mean takes just two values; how can it be normally distributed!? – user158565 Oct 13 '18 at 17:41
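
The $n = N-1$ case raised in the last comment can be checked by exact enumeration. The sketch below assumes a small binary population (three 0s and three 1s, made up for illustration) and lists every distinct value the sample mean can take. The helper name is hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def exact_sample_mean_values(population, n):
    """Distinct values of the sample mean over all size-n samples drawn without replacement."""
    return sorted({Fraction(sum(s), n) for s in combinations(population, n)})


# Hypothetical binary population with N = 6; take n = N - 1 = 5
binary_population = [0, 0, 0, 1, 1, 1]
values = exact_sample_mean_values(binary_population, n=5)

print(values)  # [Fraction(2, 5), Fraction(3, 5)]
```

Leaving out one unit gives a mean of either 2/5 or 3/5, so for a binary population the sample mean indeed takes only two values. For a non-binary population with $n = N-1$, the same function returns one value per distinct population value.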
