19

This is essentially a replication of a question I found over at math.se, which didn't get the answers I hoped for.

Let $\{ X_i \}_{i \in \mathbb{N}}$ be a sequence of independent, identically distributed random variables, with $\mathbb{E}[X_i] = 1$ and $\mathbb{V}[X_i] = 1$.

Consider the evaluation of

$$ \lim_{n \to \infty} \mathbb{P}\left(\frac{1}{\sqrt{n}} \sum_{i=1}^n X_i \leq \sqrt{n}\right) $$

This expression has to be manipulated before taking the limit since, as it stands, both sides of the inequality inside the probability tend to infinity.

A) TRY SUBTRACTION

Before considering the limiting statement, subtract $\sqrt{n}$ from both sides:

$$\lim_{n \to \infty} \mathbb{P}\left(\frac{1}{\sqrt{n}} \sum_{i=1}^n X_i -\sqrt{n} \leq \sqrt{n}-\sqrt{n} \right) = \lim_{n \to \infty} \mathbb{P}\left(\frac{1}{\sqrt{n}} \sum_{i=1}^n (X_i - 1) \leq 0\right) \\ = \Phi(0) = \frac{1}{2}$$

the last equality by the CLT, where $\Phi()$ is the standard normal distribution function.

B) TRY MULTIPLICATION

Multiply both sides by $1/\sqrt{n}$ $$\lim_{n \to \infty} \mathbb{P}\left(\frac {1}{\sqrt{n}}\cdot \frac{1}{\sqrt{n}} \sum_{i=1}^n X_i \leq \frac {1}{\sqrt{n}}\cdot\sqrt{n} \right) = \lim_{n \to \infty} \mathbb{P}\left(\frac{1}{n} \sum_{i=1}^n X_i \leq 1\right) $$

$$= \lim_{n \to \infty} \mathbb{P}\left(\bar X_n \leq 1\right) = \lim_{n \to \infty}F_{\bar X_n}(1) = 1$$

where $F_{\bar X_n}()$ is the distribution function of the sample mean $\bar X_n$, which by the LLN converges in probability (and so also in distribution) to the constant $1$, hence the last equality.

So we get conflicting results. Which is the right one, and why is the other wrong?
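For a numerical sanity check, here is a minimal Monte Carlo sketch (Python/NumPy); the choice of exponential(1) variables, which have mean and variance both equal to $1$, and the particular sample sizes are arbitrary illustrative assumptions, not part of the problem statement.

```python
import numpy as np

rng = np.random.default_rng(0)

def estimate_prob(n, reps=1_000_000):
    # X_i ~ exponential(1) has E[X_i] = Var[X_i] = 1, and the sum of n of them
    # is exactly Gamma(n, 1), so we can draw the sum directly and cheaply.
    s = rng.gamma(shape=n, scale=1.0, size=reps)       # sum_{i=1}^n X_i
    return np.mean(s / np.sqrt(n) <= np.sqrt(n))       # P( (1/sqrt(n)) * sum X_i <= sqrt(n) )

for n in (10, 100, 10_000, 1_000_000):
    print(n, estimate_prob(n))
# The estimates settle near 0.5 rather than near 1 as n grows.
```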

Glorfindel
Alecos Papadopoulos
  • A link to the math.se question would perhaps be useful – Juho Kokkala Jun 25 '18 at 07:06
  • 1
    @JuhoKokkala Sure, here it is, https://math.stackexchange.com/q/2830304/87400 Just ignore the OP's mistake there. – Alecos Papadopoulos Jun 25 '18 at 07:07
  • 2
    I think the problem is in the second statement invoking the LLN – Glen_b Jun 25 '18 at 07:46
@Glen_b Simulation of the original statement (sum divided by square root) suggests the opposite, though. – Alecos Papadopoulos Jun 25 '18 at 09:18
  • 3
    I followed you right up to the final equality. It's clearly wrong, because we would expect $\mathbb{P}(\bar X_n\le 1)$ to approximate $1/2$ for large $n$ and therefore its limit should not equal $1.$ What is the intended justification of it? It is not the statement of any version of a law of large numbers that I know. – whuber Jun 25 '18 at 12:35
  • 1
@whuber Supposedly, that all probability for the sample mean concentrates at the value $1$. If this is wrong, I believe it is important for the mistake to be spelled out in an answer; that's the purpose of this question. – Alecos Papadopoulos Jun 25 '18 at 12:40
  • 1
    I'm baffled by it because there is no statement of a law of large numbers that implies your conclusion, and it's unclear what reasoning you are trying to invoke to justify it. You could clarify your question by (a) quoting, in detail, the theorem you believe you are applying and (b) explaining why you think it applies in this circumstance. – whuber Jun 25 '18 at 13:00
  • 1
@whuber But the "argument" is already written: the sample mean converges in probability, and so also in distribution, to a constant. Why is more than that needed in order for someone to show that it is wrong? – Alecos Papadopoulos Jun 25 '18 at 13:25
  • I don't think the $\Phi(0)$ step applies. The previous step seems undefined, not 0. You're effectively arguing that $ P( \infty \cdot 0 \leqslant 0) \equiv \Phi(0) $ which it's not. Hence the subtraction variant is unhelpful. – Tasos Papastylianou Jun 25 '18 at 13:54
  • I also disagree with your second conclusion. In my view $$ \lim_{n \to \infty} P ( \bar X_n \leq 1 ) = P( \mathbb{E}[X] \leq 1) $$, and according to CLT, $\mathbb{E}[X]$ is gaussian with $\mu = 1$, therefore that probability should equal 1/2. – Tasos Papastylianou Jun 25 '18 at 14:08
  • Having said all that, if you formulate the subtraction variant in a way that expresses $ \mathbb{E}[X] $, then you get the same conclusion of 1/2, so the two are consistent in that sense. – Tasos Papastylianou Jun 25 '18 at 14:12
Tasos Papastylianou, if you would follow your logic, then you would obtain $P(1\leq 1) =1$ and not 1/2. The point is that the last equality from Alecos Papadopoulos is wrong. We have $P(\overline{X}_n \leq x ) \to 1$ for $x>0$, and $P(\overline{X}_n \leq x ) \to 0$ for $x<0$, while for $x = 0$ (the discontinuity point of the distribution of $Y\equiv 1$) we have $P(\overline{X}_n \leq x ) \to 0.5$. For an intuitive example let $X_1,\dots, X_n$ be iid $N(1,1)$ so that $\overline{X}_n \sim N(1,1/n)$ exactly; then it clearly holds that $P(\overline{X}_n\leq 1) \to 0.5$. – chRrr Jun 25 '18 at 14:26
  • 1
    @chRrr you only get $ P(1 \leqslant 1) $ if you take out $E[X]$ as if it were undistributed and _necessarily_ equal to $1$. Which is the case for a strong LLN, as I understand it, and in a CLT is equivalent to saying the variance of $E[X]$ has collapsed to 0 making its distribution a delta rather than a true gaussian. So as I understand it, the correct interpretation of all the above depends on whether one is attacking this in a strong / analytical manner, or a weak / numerical one. – Tasos Papastylianou Jun 25 '18 at 14:31
  • I suspect if Alecos is attempting to confirm this numerically for n = "a very large number effectively infinite", he is getting p = 1/2 for any approach that tests exactly how much of the distribution is <= 1 , and p = 1 for any approach that tests if the entire distribution lies within a very small threshold left and right of 1. – Tasos Papastylianou Jun 25 '18 at 14:34
$E(X)$ is not a random variable. (And I made a mistake in my comment: the cases here are $x<1$, $x=1$, $x>1$ instead of $x<0$, $x=0$, $x>0$.) – chRrr Jun 25 '18 at 14:43
  • 1
    @alecos my statement was consistent with my subsequent simulations (I checked before I commented); the argument based on the LLN was not consistent with simulations – Glen_b Jun 25 '18 at 14:45
  • 2
    Alecos, my concern isn't whether the final step is wrong: it concerns *your reasons for making it.* Isn't that after all what the question is about? I still haven't read anything from you giving those reasons and I would hesitate even to guess what they might be. Although you refer to an "LLN," I believe the resolution of your problem is likely to lie in describing precisely what you understand "LLN" to assert. – whuber Jun 25 '18 at 16:14
  • 1
@whuber It is well known what the LLN asserts _verbally_: that, in our case, "$\bar X_n$ converges in probability to $1$". Then, we learn that "convergence in probability" implies "convergence in distribution". It is this chain of implications, applied hurriedly, that leads to the apparently correct and so conflicting result of approach B). The goal here is for an answer to show how, where, and why this 2nd approach fails. This would be valuable, because it is likely that the everyday user of basic limiting theory may try this 2nd approach in order to evaluate the initial limit. – Alecos Papadopoulos Jun 25 '18 at 18:12
  • @Glen_b Apologies for the teaser, I was _surely_ (not just almost surely) certain that you have done your simulations before commenting. Mine was not extended enough and unreliably indicative. – Alecos Papadopoulos Jun 25 '18 at 18:17
  • 1
    It would be nice if you would edit the post to include those previous comments. BTW, I'm not so sure about your claim that some (unstated until just now) "verbal" account of a theorem is "well known." It's easy to find support for assertions that certain *careful mathematical statements* of convergence laws are "well known"--or at least widely published and quoted--but much harder to demonstrate that (incorrect) paraphrasings have much currency. – whuber Jun 25 '18 at 18:36
You are conflating the limit of the sequence of probabilities and the probability of the limiting value being $\leq 1$. Both statements can be true; the probability that the limiting value is $\leq 1$ can be equal to $1$ while the limit of the sequence of probabilities can be equal to $1/2$. – jbowman Jun 26 '18 at 00:47
Can we speak of the probability $\Pr\left(\frac{1}{n} \sum_{i=1}^{n} X_i \leq 1\right)$ being well defined for $n \to \infty$? – Sextus Empiricus Jun 26 '18 at 19:23
@MartijnWeterings Certainly. Its probabilities are given by the distribution function of a constant rv at the points of continuity, and the probability at the point of discontinuity is given by the CLT approach. The fact that, "taken together", these are expressed by a step function that does not satisfy all the properties of a distribution function does not imply that the probabilities are not well defined. – Alecos Papadopoulos Jun 26 '18 at 19:52
  • @MartijnWeterings See this answer of mine for an informal discussion on the properties of a CDF. https://stats.stackexchange.com/a/253121/28746. – Alecos Papadopoulos Jun 26 '18 at 21:00
  • 1
    @MartijnWeterings In our a bit weird case, the 2nd half of probability mass arises in connection to a number that we cannot write -namely the "closest real to the value 1 from the right". We cannot write it -but it exists. – Alecos Papadopoulos Jun 26 '18 at 21:08
  • I am still a bit confused because there are so many descriptions out there. One popular view is https://en.wikipedia.org/wiki/Normal_distribution#Zero-variance_limit which relates to the question without thinking about the LLN (we can just as well directly consider $\bar{X}_n \sim N(1,\frac{1}{n})$). According to that view you have $lim_{n \to \infty} P(\bar{X}_n - 1 \leq 0) = 1$ while using the CLT expression you get $P(\sqrt{n} (\bar{X}_n - 1) \leq 0) = 0.5$. This makes me wonder whether $\bar{X}_n - 1 \leq 0$ is a well defined event for $n \to \infty$ – Sextus Empiricus Jun 26 '18 at 22:04

5 Answers

16

The error here is likely in the following fact: convergence in distribution only requires that $F_n(x)$ converge to $F(x)$ at points of continuity of $F(x)$. Since the limit distribution is that of a constant random variable, it has a jump discontinuity at $x=1$; hence it is incorrect to conclude that $F_n(1)$ converges to $F(1)=1$.
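To make this concrete, here is a small numerical sketch (an illustration of mine, assuming for simplicity that the $X_i$ are $N(1,1)$ so that $\bar X_n \sim N(1, 1/n)$ exactly): the CDF of $\bar X_n$ converges to the degenerate CDF at every $x \neq 1$, but sits at $0.5$ at the jump point $x = 1$ for every $n$.

```python
from scipy.stats import norm

# Illustration assuming X_i ~ N(1, 1), so that the sample mean is exactly N(1, 1/n).
for n in (10, 1_000, 100_000, 10_000_000):
    F = norm(loc=1.0, scale=(1.0 / n) ** 0.5).cdf
    print(n, F(0.99), F(1.0), F(1.01))
# F(0.99) -> 0 and F(1.01) -> 1 (continuity points of the limiting CDF),
# but F(1.0) = 0.5 for every n, so F_n(1) does not converge to F(1) = 1.
```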

Richard Hardy
Alex R.
  • 1
    The way we define convergence in disribution does not _exclude_ the possibility of convergence at the points of discontinuity -it just doesn't _require_ it. – Alecos Papadopoulos Jun 25 '18 at 09:14
  • 1
    But if convergence in distribution does not require $F_n(1)$ to converge to $F(1)$, what is the last equality in the question based on? – Juho Kokkala Jun 25 '18 at 17:00
  • 1
    @Juho It's not based on anything--that's the crux of the matter. *There is no theorem that allows one to make the last equation in the question.* – whuber Jun 25 '18 at 18:37
  • 1
@AlecosPapadopoulos: I never said that it excludes the possibility. I'm implicitly saying that you need to justify the last equality beyond what is given to you by convergence in distribution. For example if $X_n$ is Bernoulli, then it would be true. – Alex R. Jun 25 '18 at 18:45
11

For iid random variables $X_i$ with $E[X_i]= \operatorname{var}(X_i)=1$ define \begin{align}Z_n &= \frac{1}{\sqrt{n}}\sum_{i=1}^n (X_i-1),\\ Y_n &= \frac{1}{{n}}\sum_{i=1}^n X_i. \end{align} Now, the CLT says that for every fixed real number $z$, $\lim_{n\to\infty} F_{Z_n}(z) = \Phi(z)$. The OP applies the CLT to evaluate $$\lim_{n\to\infty}P\left(Z_n \leq 0\right) = \Phi(0) = \frac 12.$$

As the other answers as well as several of the comments on the OP's question have pointed out, it is the OP's evaluation of $\lim_{n\to\infty} P(Y_n \leq 1)$ that is suspect. Consider the special case when the iid $X_i$ are discrete random variables taking on values $0$ and $2$ with equal probability $\frac 12$. Now, $\sum_{i=1}^n X_i$ can take on only even integer values in $[0,2n]$ and so when $n$ is odd, $\sum_{i=1}^n X_i$ cannot take on value $n$ and hence $Y_n = \frac 1n \sum_{i=1}^n X_i$ cannot take on value $1$. Furthermore, since the distribution of $Y_n$ is symmetric about $1$, we have that $P(Y_n \leq 1) = F_{Y_n}(1)$ has value exactly $\frac 12$ whenever $n$ is odd. Thus, the sequence of numbers $$P(Y_1 \leq 1), P(Y_2 \leq 1), \ldots, P(Y_n \leq 1), \ldots$$ contains the subsequence $$P(Y_1 \leq 1), P(Y_3 \leq 1), \ldots, P(Y_{2k-1} \leq 1), \ldots$$ in which all the terms have value $\frac 12$. For even $n$, on the other hand, writing $\sum_{i=1}^n X_i = 2B_n$ with $B_n \sim \text{Binomial}\left(n, \frac 12\right)$, we have $$P(Y_n \leq 1) = \frac 12 + \frac 12 P(Y_n = 1) = \frac 12 + \frac 12 \binom{n}{n/2}2^{-n},$$ which also converges to $\frac 12$. Hence $\lim_{n\to\infty} P(Y_n \leq 1) = \frac 12$ in this example, and the OP's claim that $P(Y_n\leq 1)$ converges to $1$ must be viewed with a great deal of suspicion.
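The probabilities in this example can be evaluated exactly from the binomial distribution; here is a short numerical check (a sketch of mine using scipy, which is an implementation detail not part of the argument above):

```python
from scipy.stats import binom

# X_i in {0, 2} with probability 1/2 each, so sum X_i = 2*B with B ~ Binomial(n, 1/2)
# and P(Y_n <= 1) = P(B <= n/2).
for n in (9, 10, 99, 100, 9_999, 10_000):
    print(n, binom.cdf(n // 2, n, 0.5))   # floor(n/2) handles odd and even n alike
# Odd n: exactly 0.5.  Even n: 0.5 + 0.5*P(B = n/2), decreasing to 0.5 -- never close to 1.
```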

Dilip Sarwate
8

Your first result is the correct one. Your error occurs in the second part, in the following erroneous statement:

$$\lim_{n \rightarrow \infty} F_{\bar{X}_n}(1) = 1.$$

This statement is false (the right-hand-side should be $\tfrac{1}{2}$) and it does not follow from the law of large numbers as asserted. The weak law of large numbers (which you invoke) says that:

$$\lim_{n \rightarrow \infty} \mathbb{P} \Big( |\bar{X}_n - 1| \leqslant \varepsilon \Big) = 1 \quad \quad \text{for all } \varepsilon > 0.$$

For all $\varepsilon > 0$ the condition $|\bar{X}_n - 1| \leqslant \varepsilon$ spans some values where $\bar{X}_n \leqslant 1$ and some values where $\bar{X}_n > 1$. Hence, it does not follow from the LLN that $\lim_{n \rightarrow \infty} \mathbb{P} ( \bar{X}_n \leqslant 1 ) = 1$.
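A quick numerical illustration of the difference between the two statements (a sketch of mine; the $X_i$ are taken to be exponential(1) purely for concreteness, so that the sample mean has an exact $\text{Gamma}(n,1)/n$ distribution, and $\varepsilon = 0.1$ is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
eps = 0.1
for n in (100, 10_000, 1_000_000):
    # For exponential(1) data the sample mean is exactly Gamma(n, 1) / n.
    xbar = rng.gamma(shape=n, scale=1.0, size=200_000) / n
    print(n,
          np.mean(np.abs(xbar - 1.0) <= eps),   # the weak-LLN event: tends to 1
          np.mean(xbar <= 1.0))                 # stays near 1/2, does not tend to 1
```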

Ben
  • 1
    The (erroneous indeed) result comes from the implication "convergence in probability implies convergence in distribution". The question does not state that the assertion comes _directly_ from the LLN. – Alecos Papadopoulos Jun 26 '18 at 00:31
@AlecosPapadopoulos: Convergence in probability *does* imply convergence in distribution. Again, convergence in distribution only requires convergence at points of continuity. But maybe you meant that convergence in probability does not imply *pointwise* convergence of the distribution functions. – Alex R. Jun 26 '18 at 17:34
  • @AlexR. I am not sure where your objection lies. I believe this issue is covered in my own answer. – Alecos Papadopoulos Jun 26 '18 at 17:36
3

Convergence in probability implies convergence in distribution. But... convergence to what distribution? If the limiting distribution has a jump discontinuity then the limit becomes ambiguous there, because multiple values at the discontinuity are compatible with convergence in distribution.

"where $F_{\bar X_n}()$ is the distribution function of the sample mean $\bar X_n$, which by the LLN converges in probability (and so also in distribution) to the constant $1$"

This is not right, and it is also easy to show that it cannot be right (independently of the apparent disagreement between the CLT and the LLN). The limiting distribution (which can be seen as the limit of a sequence of normally distributed variables) should be:

$$F_{\bar{X}_\infty}(x) = \begin{cases} 0 & \text{for } x<1 \\ 0.5& \text{for } x=1\\ 1 & \text{for } x>1 \end{cases}$$

For this function you have that, for any $\epsilon>0$ and every $x$, the difference $|F_{\bar{X}_n}(x)-F_{\bar{X}_\infty}(x)|<\epsilon$ for sufficiently large $n$. This would fail if we set $F_{\bar{X}_\infty}(1)=1$ instead of $F_{\bar{X}_\infty}(1)=0.5$.


Limit of a normal distribution

It may be helpful to explicitly write out the sum used to invoke the law of large numbers (taking the $X_i$ to be $N(1,1)$, so that the distribution below is exact).

$$\bar{X}_n=\frac{1}{n}\sum_{i=1}^n X_i \sim N(1,\frac{1}{n}) $$

The limit $n\to \infty$ of the distribution of $\bar{X}_n$ is the Dirac delta function, when that function is represented as the limit of normal densities with the variance going to zero.

Using that expression it is easier to see what is going on under the hood than by using the ready-made statements of the CLT and LLN, which obscure the reasoning behind those laws.


Convergence in probability

The law of large numbers gives you 'convergence in probability'

$$\lim_{n \to \infty} P(|\bar{X}_n-1|>\epsilon) =0 $$

for every $\epsilon > 0$.

An equivalent statement can be made in terms of the quantity appearing in the central limit theorem: $\lim_{n \to \infty} P\left(\left|\frac{1}{\sqrt{n}}\sum_{i=1}^n \left( X_i-1 \right)\right|>\sqrt{n}\,\epsilon\right) =0$.

It is wrong to state that this implies $$\lim_{n \to \infty} P(|\bar{X}_n-1|>0) =0 $$

It is unfortunate that this question was cross-posted so early (confusing, yet interesting to see the different discussions/approaches, math vs. stats, so not all that bad). The answer by Michael Hardy on the math stackexchange deals with it very effectively in terms of the strong law of large numbers (the same principle as the accepted answer from drhab in the cross-posted question and Dilip's answer here). We are almost sure that the sequence $\bar{X}_1, \bar{X}_2, \bar{X}_3, \ldots, \bar{X}_n$ converges to $1$, but this does not mean that $\lim_{n \to \infty} P(\bar{X}_n = 1)$ will be equal to $1$ (indeed, in Dilip's example it is $0$, while $P(\bar{X}_n \leq 1)$ tends to $\tfrac12$ rather than $1$). The dice example in the comments by Tomasz shows this very nicely from a different angle: the mean of a sequence of dice rolls will converge to the mean of the die, but the probability of it being exactly equal to that mean goes to zero.


Heaviside step function and Dirac delta function

The CDF of $\bar{X}_n$ is the following:

$$F_{\bar{X}_n}(x) = \frac{1}{2} \left(1 + \text{erf} \frac{x-1}{\sqrt{2/n}} \right)$$

with, if you like, $\lim_{n \to \infty} F_{\bar{X}_n}(1) = 0.5$ (the limit is the Heaviside step function with value $\tfrac12$ at the jump, i.e. the integral of the Dirac delta function when the delta is viewed as the limit of normal densities).
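Evaluating this CDF near $x = 1$ makes the step-function picture concrete (a small sketch of mine using scipy's erf; the particular $n$ and offsets are arbitrary):

```python
import numpy as np
from scipy.special import erf

def F_xbar(x, n):
    # CDF of the mean of n iid N(1, 1) variables, i.e. of N(1, 1/n)
    return 0.5 * (1.0 + erf((x - 1.0) / np.sqrt(2.0 / n)))

for n in (10, 1_000, 100_000):
    print(n, F_xbar(0.99, n), F_xbar(1.0, n), F_xbar(1.01, n))
# F(1 - delta) -> 0 and F(1 + delta) -> 1 for any fixed delta > 0,
# while F(1) = 0.5 for every n: the Heaviside step with value 1/2 at the jump.
```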


I believe that this view intuitively resolves your request to 'show that it is wrong', or at least it shows that understanding the cause of this disagreement between the CLT and the LLN is equivalent to understanding the integral of the Dirac delta function, or a sequence of normal distributions with variance decreasing to zero.

Sextus Empiricus
  • 2
Your limiting distribution is in fact not a distribution at all. A CDF must be right continuous, whereas yours clearly is not at $x=1$, where its value is $1/2$. – Alex R. Jun 26 '18 at 15:56
The right continuity seems to be necessary so that for every $a$ we have $\lim_{n \to \infty} F_{X}(a+\frac{1}{n}) = F_{X}(a)$: since the events $X \leq a+\frac{1}{n}$ are nested we should have $$\lim_{n \to \infty} F_{X}(a+\tfrac{1}{n}) = \lim_{n \to \infty} P(X \leq a+\tfrac{1}{n}) = P\left(\bigcap_{n} \left\{X \leq a+\tfrac{1}{n}\right\}\right) = P(X \leq a) = F_{X}(a),$$ but is this true for our case, and where is the catch? Is this right continuity necessary based on the probability axioms, or is it just a convention such that the CDF works for the most common cases? – Sextus Empiricus Jun 26 '18 at 22:36
@MartijnWeterings: This is precisely where it comes from. Any valid measure $P$ must satisfy these monotonicity results. They are a consequence of the boundedness of $P$ along with countable additivity. More generally, a function $F(x)$ is a CDF (i.e. corresponds to some distribution $P$ via $F(b)-F(a)=P(a<X\leq b)$) if and only if it is non-decreasing, right-continuous, and has limits $0$ at $-\infty$ and $1$ at $+\infty$. – Alex R. Jun 26 '18 at 23:17
2

I believe it should be clear by now that "the CLT approach" gives the right answer.

Let's pinpoint exactly where the "LLN approach" goes wrong.

Starting with the finite-$n$ statements, it is clear that we can equivalently either subtract $\sqrt{n}$ from both sides, or multiply both sides by $1/\sqrt{n}$. We get

$$\mathbb{P}\left(\frac{1}{\sqrt{n}} \sum_{i=1}^n X_i \leq \sqrt{n}\right)=\mathbb{P}\left(\frac{1}{\sqrt{n}} \sum_{i=1}^n(X_i-1) \leq 0\right) = \mathbb{P}\left(\frac{1}{n} \sum_{i=1}^nX_i \leq 1\right)$$

So if the limit exists, it will be identical. Setting $Z_n = \frac{1}{\sqrt{n}} \sum_{i=1}^n(X_i-1)$, we have, using distribution functions

$$\mathbb{P}\left(\frac{1}{\sqrt{n}} \sum_{i=1}^n X_i \leq \sqrt{n}\right)= F_{Z_n}(0) = F_{\bar X_n}(1)$$

...and it is true that $\lim_{n\to \infty}F_{Z_n}(0)= \Phi(0) = 1/2$.

The thinking in the "LLN approach" goes as follows: "We know from the LLN that $\bar X_n$ converges in probability to a constant. And we also know that "convergence in probability implies convergence in distribution". So, $\bar X_n$ converges in distribution to a constant". Up to here we are correct.
Then we state: "therefore, limiting probabilities for $\bar X_n$ are given by the distribution function of the constant at $1$ random variable",

$$F_1(x) = \begin{cases} 1 & x\geq 1 \\ 0 & x<1 \end{cases} \implies F_1(1) = 1$$

... so $\lim_{n\to \infty} F_{\bar X_n}(1) = F_1(1) = 1$...

...and we just made our mistake. Why? Because, as @AlexR.'s answer noted, "convergence in distribution" covers only the points of continuity of the limiting distribution function. And $1$ is a point of discontinuity for $F_1$. This means that $\lim_{n\to \infty} F_{\bar X_n}(1)$ may equal $F_1(1)$, but it need not, without negating the "convergence in distribution to a constant" implication of the LLN.

And from the CLT approach we know what the value of the limit must be ($1/2$); I do not know of a way to prove directly that $\lim_{n\to \infty} F_{\bar X_n}(1) = 1/2$.

Did we learn anything new?

I did. The LLN asserts that

$$\lim_{n \rightarrow \infty} \mathbb{P} \Big( |\bar{X}_n - 1| \leqslant \varepsilon \Big) = 1 \quad \quad \text{for all } \varepsilon > 0$$

$$\implies \lim_{n \rightarrow \infty} \Big[ \mathbb{P} \Big( 1-\varepsilon <\bar{X}_n \leq 1\Big) + \mathbb{P} \Big( 1 <\bar{X}_n \leq 1+\varepsilon\Big)\Big] = 1$$

$$\implies \lim_{n \rightarrow \infty} \Big[ \mathbb{P} \Big(\bar{X}_n \leq 1\Big) + \mathbb{P} \Big( 1 <\bar{X}_n \leq 1+\varepsilon\Big)\Big] = 1$$

The LLN does not say how the probability is allocated within the $(1-\varepsilon, 1+\varepsilon)$ interval. What I learned is that, in this class of convergence results, the probability is, in the limit, allocated equally on the two sides of the centerpoint of the collapsing interval.
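A simulation sketch of this equal allocation (my illustration; the exponential(1) variables and $\varepsilon = 0.05$ are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)
eps = 0.05
for n in (1_000, 100_000, 10_000_000):
    xbar = rng.gamma(shape=n, scale=1.0, size=200_000) / n   # sample mean of exponential(1) data
    left  = np.mean((xbar > 1 - eps) & (xbar <= 1))          # mass in (1 - eps, 1]
    right = np.mean((xbar > 1) & (xbar <= 1 + eps))          # mass in (1, 1 + eps]
    print(n, left, right, left + right)
# left + right -> 1 (the LLN), while left and right each approach 1/2:
# in the limit the collapsing interval carries half of the mass on each side of 1.
```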

The general statement here is, assume

$$X_n\to_p \theta,\;\;\; h(n)(X_n-\theta) \to_d D(0,V)$$

where $D$ is some rv with distribution function $F_D$. Then

$$\lim_{n\to \infty} \mathbb P[X_n \leq \theta] = \lim_{n\to \infty}\mathbb P[h(n)(X_n-\theta) \leq 0] = F_D(0)$$

...which may not be equal to $1$, the value that the distribution function of the constant random variable takes at the point $\theta$.
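As a concrete instance (an illustrative choice of mine, not part of the statement above): take iid exponential(1) data, $X_n = \bar X_n$, $\theta = 1$, $h(n) = \sqrt{n}$, so $D = N(0,1)$ and $F_D(0) = 1/2$. The exact probabilities can be computed from the Gamma distribution of the sum:

```python
from scipy.stats import gamma

# For exponential(1) data, sum X_i ~ Gamma(n, 1), so P[ Xbar_n <= 1 ] = P[ Gamma(n, 1) <= n ].
for n in (10, 1_000, 100_000, 10_000_000):
    print(n, gamma.cdf(n, a=n))
# The values decrease towards 0.5 = F_D(0), not towards 1.
```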

Also, this is a strong example showing that, when the distribution function of the limiting random variable has discontinuities, "convergence in distribution to a random variable" may describe a situation where "the limiting distribution" disagrees with the "distribution of the limiting random variable" at the discontinuity points. Strictly speaking, the limiting distribution at the continuity points is that of the constant random variable. For the discontinuity points we may be able to calculate the limiting probability as a "separate" entity.

Alecos Papadopoulos
  • The 'lesson learned' perspective is interesting, and this is a good, not too difficult, example for didactic application. Although I wonder what kind of (direct) practical application this thinking about the infinite has, because eventually in practice $n \neq \infty$ – Sextus Empiricus Jun 26 '18 at 12:56
@MartijnWeterings Martijn, the motivation here was certainly educational, a) as an alert to discontinuities even in such a "flat" situation as the convergence to a constant, and so also in general (they destroy uniform convergence, for example), and b) because a result on how the probability mass is allocated becomes interesting when the sequence that converges in probability to a constant still has a non-zero variance. – Alecos Papadopoulos Jun 26 '18 at 13:00
We could say that the CLT lets us say something about convergence to a limiting normally distributed variable (thus being able to express such things as $F(x)$), but the LLN only allows us to say that, by increasing the sample size, we get closer to the true mean; it does not say that we get, with higher probability, exactly equal to the true mean. The LLN means that the sample mean gets closer and closer to a limiting value but not (with higher probability) equal to it. The LLN says nothing about $F(x)$. – Sextus Empiricus Jun 26 '18 at 13:24
The original thoughts around the LLN were actually the opposite (see the reasoning of Arbuthnot https://stats.stackexchange.com/questions/343268/). *"It is visible from what has been said, that with a very great Number of Dice, A's Lot would become very small... there would be but a small part of all the possible Chances, for its happening at any assignable time, that an equal Number of Males and Females should be born."* – Sextus Empiricus Jun 26 '18 at 13:25