Kabacoff 2015 suggests that if we're not willing to assume the sampling distribution of the mean is normally distributed, we should use bootstrapping to estimate the sampling distribution of the mean.

But hang on a minute, one of the first things we're told in many statistics texts is that the sampling distribution of the mean is normally distributed. How often will the sampling distribution of the mean not be normally distributed?
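
For concreteness, here is a minimal sketch of what bootstrapping the sampling distribution of the mean could look like. It is not the code from Kabacoff's book; the function name `bootstrap_means`, the simulated exponential data, and the choice of 10,000 resamples are all assumptions made for illustration.

```python
import numpy as np

def bootstrap_means(sample, n_boot=10000, seed=0):
    """Resample `sample` with replacement `n_boot` times; return the mean of each resample."""
    rng = np.random.default_rng(seed)
    sample = np.asarray(sample, dtype=float)
    # Each row holds the indices of one bootstrap resample, same size as the data.
    idx = rng.integers(0, sample.size, size=(n_boot, sample.size))
    return sample[idx].mean(axis=1)

# Hypothetical skewed data standing in for a real sample.
rng = np.random.default_rng(1)
sample = rng.exponential(scale=2.0, size=40)

boot = bootstrap_means(sample)
# Percentile interval taken directly from the bootstrap distribution,
# so no normality assumption is made about the sample mean.
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"sample mean = {sample.mean():.3f}")
print(f"95% percentile interval = ({lo:.3f}, {hi:.3f})")
```

The empirical distribution of `boot` stands in for the sampling distribution of the mean; a histogram of it shows whether a normal approximation looks plausible for this sample.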

luciano
  • Note that what we're told is that the sampling distribution of the mean approximates better & better to a normal distribution as the sample size increases - at least in any good text. It's called the central limit theorem. Many texts would also add the qualification that sampling must be from a distribution whose mean & variance exist for this to hold. So see [Example of distribution where large sample size is necessary for central limit theorem](http://stats.stackexchange.com/q/61798/17230) & [Why does the Cauchy distribution have no mean?](http://stats.stackexchange.com/q/61798/17230). – Scortchi - Reinstate Monica Jun 07 '15 at 14:47
  • Play around with this simulation (temperamental with different versions of Java) and you will see that how soon you approach normality depends on how weird the initial distribution is and how large the sample is. http://onlinestatbook.com/stat_sim/sampling_dist/index.html (@Glen_b and @Scortchi provide more technical answers.) – zbicyclist Jun 08 '15 at 04:19

1 Answer

> one of the first things we're told in many statistics texts is that the sampling distribution of the mean is normally distributed

Well, no, we're usually told something different to that.

[If you can find a reference that actually says what you said, I can show why they're wrong easily enough. But most texts don't say that.]

In practice, when is the sampling distribution of the mean actually normal? For iid random variables, I think the sample mean is only going to be exactly normal when the individual components are themselves normal.
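
As a sketch of why this holds, the relevant result is, I believe, Cramér's decomposition theorem. The symbols $X_i$, $\bar X_n$, $m$, $s^2$, $\mu$ and $\sigma^2$ below are introduced here for illustration only.

```latex
\[
\bar X_n = \frac{1}{n}\sum_{i=1}^{n} X_i ,
\qquad X_1,\dots,X_n \ \text{independent}.
\]
% Cramer's theorem: a normal sum of independent terms forces normal terms.
\[
\sum_{i=1}^{n} X_i \sim N(m, s^2) \ \text{for some finite } n
\;\Longrightarrow\;
\text{each } X_i \ \text{is normal (possibly degenerate)}.
\]
% The converse direction is the familiar textbook case.
\[
X_i \overset{\text{iid}}{\sim} N(\mu, \sigma^2)
\;\Longrightarrow\;
\bar X_n \sim N\!\left(\mu, \frac{\sigma^2}{n}\right).
\]
```

Since $\bar X_n$ is just the sum rescaled by $1/n$, an exactly normal sample mean at any finite $n$ requires normal components; anything else is only an approximation.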

> How often will the sampling distribution of the mean not be normally distributed?

Well, in truth, it's probably never actually normally distributed. However, often the distribution of sample means will be very well approximated by a normal.


The obvious next question is "How often will it be 'close enough'?".

That depends on
(i) your circumstances (some applications - almost never; other applications, quite often);
(ii) your sample size (with really large samples you'll see it more often than if your typical sample sizes are small); and
(iii) your tolerance for deviation from normality (i.e. how close is close enough for you? We can't tell you that).
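
A small simulation can make points (i) and (ii) concrete. This is only a sketch under assumed conditions: the two parent distributions (exponential and a heavily skewed lognormal), the sample sizes, and the use of skewness as the measure of non-normality are my choices for illustration, and the helper names are hypothetical.

```python
import numpy as np

def skewness_of_sample_means(draw, n, reps=20000, seed=0):
    """Simulate `reps` sample means of size `n`; return the skewness of those means."""
    rng = np.random.default_rng(seed)
    means = draw(rng, (reps, n)).mean(axis=1)
    centered = means - means.mean()
    # Moment-based skewness; it is 0 for a normal distribution.
    return float((centered**3).mean() / (centered**2).mean() ** 1.5)

def draw_exponential(rng, size):
    # Moderately skewed parent distribution.
    return rng.exponential(scale=1.0, size=size)

def draw_lognormal(rng, size):
    # Much more heavily skewed parent distribution.
    return rng.lognormal(mean=0.0, sigma=1.5, size=size)

for n in (5, 30, 200):
    exp_skew = skewness_of_sample_means(draw_exponential, n)
    logn_skew = skewness_of_sample_means(draw_lognormal, n)
    print(f"n = {n:3d}  exponential: {exp_skew:.2f}  lognormal: {logn_skew:.2f}")
```

For the exponential parent the skewness of the mean shrinks like $2/\sqrt{n}$, so it is small by a few hundred observations; for the lognormal parent it stays clearly non-zero at the same sample sizes. Whether either is "close enough" is still point (iii).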

Glen_b
  • Regarding 'For iid random variables, I think the sample mean is only going to be actually normal when the individual components are normal.' I thought the sampling distribution of the mean will be normal even if the random variable from which the mean is calculated is non-normal? – luciano Jun 08 '15 at 07:32
  • Regarding 'Well, in truth, probably never'. Should that read 'Well, in truth, probably always'? – luciano Jun 08 '15 at 07:33
  • @luciano: "I thought the sampling distribution of the mean will be normal even if the random variable from which the mean is calculated is non-normal?" - even if the sample size is one? – Scortchi - Reinstate Monica Jun 08 '15 at 08:00
  • @luciano Yes, thanks for that. I've edited to make it fit with what the actual question said. – Glen_b Jun 08 '15 at 08:55