The question is simply what is stated in the title: When does the law of large numbers fail? What I mean is, in what cases will the frequency of an event not tend to the theoretical probability?
1 Answer
There are two such theorems (both due to Kolmogorov), and both require the expected value to be finite. The first applies when the variables are IID; the second applies when sampling is independent and the variances of the $X_n$ satisfy
$$\sum_{n=1}^\infty \frac{V(X_n)}{n^2} < \infty$$
Say that all the $X_n$ have expected value 0 but their variance is $n^2$, so that the condition clearly fails. What happens then? You can still compute a running sample mean, but that mean will not tend to 0 as you sample deeper and deeper: it tends to deviate more and more as you keep sampling.
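To see this numerically, here is a minimal simulation sketch. The choice $X_n \sim N(0, n^2)$ is mine, purely for illustration; nothing above fixes a distribution, only the variances.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative assumption: X_n ~ N(0, n^2), so E[X_n] = 0, V(X_n) = n^2,
# and sum V(X_n)/n^2 = sum 1 diverges.
N = 100_000
n = np.arange(1, N + 1)
x = rng.normal(loc=0.0, scale=n)      # scale = standard deviation = n
running_mean = np.cumsum(x) / n

# V(mean at step n) = (1/n^2) * sum_{k<=n} k^2 ~ n/3, so the typical
# deviation of the sample mean grows like sqrt(n/3) instead of shrinking.
for stop in (100, 1_000, 10_000, 100_000):
    print(stop, running_mean[stop - 1])
```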
Let's give an example where the failure can be shown rigorously. Say that $X_n$ is uniform $U(-n2^n , n2^n)$, so that the condition above fails epically:
$$\sum_{n=1}^\infty \frac{V(X_n)}{n^2} = \sum_{n=1}^\infty \frac{n^2 2^{2n+2}}{12}\frac{1}{n^2} = \frac{1}{3} \sum_{n=1}^\infty 4^n = \infty.$$
By noting that
$$\bar{X}_n = \frac{X_n}{n} + \frac{n-1}{n}\bar{X}_{n-1},$$
we see by induction that the computed average $\bar{X}_n$ always stays within the interval $(-2^{n+1}, 2^{n+1})$: if $|\bar{X}_{n-1}| < 2^n$, then $|\bar{X}_n| \le \frac{|X_n|}{n} + \frac{n-1}{n}|\bar{X}_{n-1}| < 2^n + 2^n = 2^{n+1}$. By using the same formula for $n+1$, we also see that there is always a chance of at least $1/4$ that $\bar{X}_{n+1}$ lies outside $(-2^n, 2^n)$. Indeed, $\frac{X_{n+1}}{n+1}$ is uniform $U(-2^{n+1},2^{n+1})$, so it exceeds $2^n$ with probability $1/4$. Independently of it, $\frac{n}{n+1}\bar{X}_n$ is nonnegative with probability $1/2$ by symmetry, and when both events occur we get $\bar{X}_{n+1} > 2^n$. Hence $P(\bar{X}_{n+1} > 2^n) \geq 1/8$, and by symmetry $P(\bar{X}_{n+1} < -2^n) \geq 1/8$ as well. These two events are disjoint, so $P(|\bar{X}_{n+1}| > 2^n) \geq 1/4$ for every $n$, and $\bar{X}_n$ cannot converge to 0 as $n$ goes to infinity.
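If you want to check this argument by simulation, here is a rough Monte Carlo sketch; the horizon $n \le 30$, the number of replicates, and the seed are arbitrary choices of mine ($n$ is kept small so that $n2^n$ stays comfortably within floating-point range).

```python
import numpy as np

rng = np.random.default_rng(1)

n_max, n_sims = 30, 10_000
n = np.arange(1, n_max + 1)
half_width = n * 2.0 ** n                     # X_n ~ U(-n*2^n, n*2^n)

x = rng.uniform(-half_width, half_width, size=(n_sims, n_max))
running_mean = np.cumsum(x, axis=1) / n

# Empirical P(|Xbar_{n+1}| > 2^n): the argument above guarantees at least
# 1/4, and the estimate stays bounded away from 0 for every n.
exceeds = np.abs(running_mean[:, 1:]) > 2.0 ** n[:-1]
print(exceeds.mean(axis=0))
```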
Now, to answer your question specifically, consider an event $A$. If I understood correctly, you are asking: under what conditions is the following statement false?
$$ \lim_{n \rightarrow \infty} \frac{1}{n}\sum_{k = 1}^{n} 1_A(X_k) = P(X \in A), \; [P]\;a.s.$$
where $1_A$ is the indicator function of the event $A$, i.e. $1_A(X_k) = 1$ if $X_k \in A$ and $0$ otherwise, and the $X_k$ are identically distributed (with the same distribution as $X$).
The variance condition above always holds here, because $1_A(X_k)$ is a Bernoulli 0-1 variable, whose variance $p(1-p)$ is bounded above by $1/4$; hence $\sum_{n=1}^\infty V(1_A(X_n))/n^2 \leq \frac{1}{4}\sum_{n=1}^\infty 1/n^2 < \infty$. Still, what can go wrong is the other assumption of the strong law of large numbers, namely independent sampling. If the random variables $X_k$ are not sampled independently, then convergence is not guaranteed.
For example, if $X_k = X_1$ for all $k$, then the frequency will be either 1 or 0 whatever the value of $n$, so convergence does not occur (unless $A$ has probability 0 or 1, of course). This is an artificial and extreme example, and I am not aware of practical cases where convergence to the theoretical probability fails; still, the possibility exists whenever sampling is not independent, as the sketch below illustrates.
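To make the contrast concrete, here is a small sketch comparing independent sampling with the extreme dependent case above; the event probability $P(A) = 0.3$ is an arbitrary illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 100_000, 0.3                  # P(A) = 0.3 chosen for illustration
k = np.arange(1, n + 1)

# Independent sampling: the running frequency of A tends to P(A).
iid = rng.random(n) < p
print((np.cumsum(iid) / k)[[99, 9_999, 99_999]])              # -> near 0.3

# Perfectly dependent sampling: X_k = X_1 for all k, so the running
# frequency is 0 or 1 for every n and never approaches P(A).
x1 = rng.random() < p
print((np.cumsum(np.full(n, x1)) / k)[[99, 9_999, 99_999]])   # all 0.0 or 1.0
```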

- On Wikipedia (the LLN page) I read that non-finiteness of the variance only slows down the convergence of the mean. Is that different from what you state? – emanuele Jun 06 '12 at 14:13
- Are you two discussing the same law? The question asks about *frequencies of events* while this reply seems to focus on the sampling distribution of a *mean*. Although there is a connection, it hasn't yet appeared explicitly here as far as I can tell. – whuber Jun 06 '12 at 14:53
- @whuber True. I focused too much on the title of the question. Thanks for pointing that out. I updated the answer. – gui11aume Jun 06 '12 at 15:46
- @gui11aume I don't understand "We see that the condition above will hold, because the variance of an indicator function is bounded above by 1/4." What does it mean? – emanuele Jun 06 '12 at 16:39
- It is not clear what $1_A(X_k)$ means or how it relates to $P(A)$. If the $X_k$ are **iid** (that is, not *just* independent), then we might interpret the former as $1_{(X_k \in A)}$ and the latter as $P(X_1 \in A)$, but otherwise... – cardinal Jun 06 '12 at 16:42
- Thanks for the recent edit. I'd still suggest you clarify that the limiting statement only makes sense if the $X_k$ are iid and also that the right-hand side should be $P(X \in A)$ and not $P(A)$. – cardinal Jun 06 '12 at 17:07
- @cardinal, yup! I was thinking about the meaning of the terms in the non-IID case but could not come up with anything satisfactory. – gui11aume Jun 06 '12 at 17:09
- @cardinal, after giving it some thought, I think they just need to be identically distributed for the formula to make sense. I edited accordingly because it fits better with what comes after. – gui11aume Jun 06 '12 at 17:17
- @emanuele I edited the answer to clarify. 1/4 is the maximum variance of a Bernoulli 0-1 variable (its variance is $p(1-p)$). – gui11aume Jun 06 '12 at 17:19
- If they are identically distributed, but not independent, the limit in question may not exist at all. – cardinal Jun 06 '12 at 17:32