3

I have read the answer here. The distinction drawn there is:

  • If $n\to\infty$ and $p\to0$ while $np$ approaches some positive number $\lambda,$ then the binomial distribution approaches a Poisson distribution with expected value $\lambda.$

  • If $n\to\infty$ as $p$ stays fixed, and $X\sim\operatorname{Binomial}(n,p)$ then the distribution of $(X-np)/\sqrt{np(1-p)}$ approaches the standard normal distribution, i.e. the normal distribution with expected value $0$ and standard deviation $1.$

I am finding it hard to wrap my head around this. Nowhere in the derivation of the central limit theorem is $p$ taken into consideration. So even if $p$ is very small, according to the CLT the standardized binomial should converge to a standard normal. And both limiting behaviors are for $n \to \infty$. Please help me understand this concept a bit more. How can the CLT fail to hold when $p$ is really small?

MiloMinderbinder
    On first pass, my instinct is that ambiguity arises here due to insufficient precision in how the semantics of "approaches" is interpreted mathematically. – microhaus Apr 10 '21 at 12:17
  • The answer you reference clearly articulates the fundamental point, beginning with "It is sloppy to say something approaches something depending on n as n→∞, unless it is precisely defined and not meant literally...." – whuber Apr 10 '21 at 12:25
  • You appear to be making a common mistake about the central limit theorem: https://stats.stackexchange.com/questions/473455/debunking-wrong-clt-statement. The central limit theorem is about a sampling distribution, not about the original population. – Dave Apr 10 '21 at 12:35

2 Answers

1

The difficulty disappears when you formulate the limits carefully. In the first case $p$ is not constant, so it is more precise to write it as $p_n$, since $p$ varies with $n$. The condition $n \cdot p_n \to \lambda > 0$ can be written another way as $p_n \sim \lambda/n$, where $\sim$ means that the ratio of the two sides converges to one as $n \to \infty$.

In the second case $p$ is constant, and however small it is, $np$ is no longer small once $n$ is large enough. The CLT remains valid when $p > 0$ is small but fixed.

To understand this better, try to apply the CLT for IID variables in the first case. Write the binomial as a sum of $n$ IID Bernoulli variables, $X_n = B_1 + \dotsm + B_n$, and check the assumptions of the CLT. You will find that $B_1, B_2, \dotsc, B_n$ must all have the same distribution, and that this distribution must not depend on $n$. Is that the case here?
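It is not: with $p_n = \lambda/n$, the common Bernoulli distribution itself changes as $n$ grows, and the binomial pmf drifts toward the Poisson pmf instead. A minimal numerical sketch (the helper names `binom_pmf` and `poisson_pmf` are mine, just the exact pmf formulas):

```python
import math

def binom_pmf(k, n, p):
    # Exact Binomial(n, p) pmf at k
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    # Exact Poisson(lam) pmf at k
    return math.exp(-lam) * lam**k / math.factorial(k)

lam = 3.0
for n in (10, 100, 10_000):
    p_n = lam / n  # the Bernoulli distribution depends on n here
    gap = max(abs(binom_pmf(k, n, p_n) - poisson_pmf(k, lam))
              for k in range(11))
    print(f"n = {n:6d}: max pmf gap to Poisson({lam}) = {gap:.5f}")
```

The maximum pointwise gap between the two pmfs shrinks toward zero as $n$ grows, which is the Poisson limit at work.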

kjetil b halvorsen
0

One can write the CLT as: $$ \frac{\sum_{i=1}^{n}X_i - n\mu}{\sigma\sqrt{n}} \stackrel{d}{\to} N(0,1) $$

If we take each $X_i$ to be an independent and identically distributed Bernoulli random variable, and recall that the mean and variance of a Bernoulli random variable are $p$ and $p(1-p)$ respectively, we can rewrite this as

$$ \frac{\sum_{i=1}^{n}X_i - np}{\sqrt{np(1-p)}} \stackrel{d}{\to} N(0,1) $$

Then note that a binomial is simply a sum of $n$ Bernoulli random variables: for $X = X_1 + \dots + X_n$ with each $X_i \sim \operatorname{Bernoulli}(p)$, we have $X \sim \operatorname{Binomial}(n,p)$. So we finally obtain the form of the CLT given in your question.

$$ \frac{X - np}{\sqrt{np(1-p)}} \stackrel{d}{\to} N(0,1) $$

This is true even when $p$ is very small, so long as the Bernoulli random variables are identically distributed. In the first case, where the binomial converges to a Poisson distribution, $p$ shrinks as $n$ goes to infinity, so the requirement that the summands be identically distributed (with a distribution that does not depend on $n$) is not satisfied.
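You can see the two regimes diverge by tracking skewness: the skewness of $\operatorname{Binomial}(n,p)$ is $(1-2p)/\sqrt{np(1-p)}$, and standardizing a variable does not change its skewness, so a normal limit forces it to $0$. A minimal Python sketch (the helper `binom_skewness` is my name for this formula) contrasts fixed $p$ with $p = \lambda/n$:

```python
import math

def binom_skewness(n, p):
    # Skewness of Binomial(n, p); the standardized variable has the same skewness
    return (1 - 2 * p) / math.sqrt(n * p * (1 - p))

lam = 3.0
for n in (100, 10_000, 1_000_000):
    fixed = binom_skewness(n, 0.01)      # p fixed, however small
    shrinking = binom_skewness(n, lam / n)  # p = lambda / n
    print(f"n = {n:9d}: fixed p -> {fixed:.4f}, p = lam/n -> {shrinking:.4f}")
```

With $p$ fixed the skewness vanishes as $n$ grows, consistent with a normal limit; with $p = \lambda/n$ it settles at $1/\sqrt{\lambda}$, the skewness of a $\operatorname{Poisson}(\lambda)$ variable, so the limit cannot be the standard normal.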

James