3

I don't understand how the CLT can hold for a uniform distribution. Say I have U[0;1]; then whatever value I sample from the population will always be 1. Therefore, every sample mean I can possibly get will also always be 1. Therefore, plotting many of these sample means should also result in a uniform distribution, since I don't have any values besides 1.

This is obviously wrong but I don't know why - I would really appreciate some clarification!

  • 2
    What does your notation `U[0; 1]` mean? It looks like the uniform distribution *between* zero and one, but then you say that every sample will be *exactly* one. I'm a bit confused as to which you mean...? – Matthew Drury Feb 05 '19 at 17:43
  • 2
    Are you perhaps confusing particular values of a probability *density* with the values that the random variable can have? – whuber Feb 05 '19 at 19:46

4 Answers

3

The uniform distribution takes values from $0$ to $1$; it is not deterministic as you described. Each draw is a value between $0$ and $1$, with mean $0.5$.

In your case, $\sqrt{n}\left( \frac{\sum_{i=1}^n X_i}{n}-\mu\right)=\sqrt{n}\left( \frac{\sum_{i=1}^n X_i}{n}-\frac12\right)$ converges in distribution to $N(0,\sigma^2)$, where $\sigma^2 = \frac{1}{12}$ is the variance of a single $U[0,1]$ draw.
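
As a quick sanity check, here is a minimal simulation sketch (assuming NumPy is available; the sample size $n = 50$ and the number of replications are arbitrary illustrative choices). The sample means cluster around $0.5$, with spread close to the CLT prediction $\sqrt{1/(12n)}$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_samples = 50, 100_000

# Each row is one sample of n draws from U(0, 1); take each row's mean.
means = rng.uniform(0.0, 1.0, size=(n_samples, n)).mean(axis=1)

print("mean of sample means:", means.mean())             # ~ 0.5
print("sd of sample means:  ", means.std())              # ~ sqrt(1/(12n))
print("CLT prediction:      ", np.sqrt(1.0 / (12.0 * n)))
```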

Siong Thye Goh
  • 6,431
  • 3
  • 17
  • 28
1

This question is a particular case of a related question asking how it is possible for the CLT to apply to random variables with bounded support. (I recommend you also read the answer to that question, since it relates closely to what you are asking here.)

The uniform distribution is a distribution with bounded support, and in this particular case you have random variables $0 \leqslant U_i \leqslant 1$. Since each of the values falls within this interval, it is certainly true that the sample mean will also fall within this interval, i.e., you must have:

$$0 \leqslant \bar{U}_n \leqslant 1.$$

As $n \rightarrow \infty$ the distribution of $\bar{U}_n$ will become narrower around its mean $\mathbb{E}(\bar{U}_n) = 1/2$, with the variance of the distribution approaching zero. This means that the probability region near the boundaries will shrink down towards zero. Moreover, from the CLT we know that the shape of the distribution will converge towards a normal distribution, notwithstanding the fact that the latter has a support that extends beyond the bounded interval that must contain the sample mean.

This may seem somewhat counter-intuitive. After all, for any $n \in \mathbb{N}$, the normal approximation to the true distribution of the sample mean always puts some non-zero probability on values outside the allowable bounds for that random variable, so the approximation always assigns some erroneous non-zero probability to impossible values. However, the CLT is an asymptotic result, so what matters for the theorem is that this erroneous probability shrinks towards zero as $n \rightarrow \infty$.
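
To see how quickly that erroneous probability vanishes, here is a minimal sketch in plain Python (no dependencies; the particular values of $n$ are arbitrary). It evaluates the mass the approximating $N(1/2, 1/(12n))$ distribution places outside $[0, 1]$:

```python
from math import erf, sqrt

def std_normal_cdf(x):
    """CDF of the standard normal distribution."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

for n in [1, 2, 5, 10, 20]:
    sigma_n = sqrt(1.0 / (12.0 * n))   # sd of the mean of n U(0,1) draws
    # P(mean < 0) + P(mean > 1) under the normal approximation;
    # by symmetry this is twice the lower tail.
    p_outside = 2.0 * std_normal_cdf((0.0 - 0.5) / sigma_n)
    print(f"n = {n:3d}: P(outside [0, 1]) = {p_outside:.2e}")
```

Already at $n = 20$ the probability outside the bounds is far below anything one would notice in practice.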

Ben
  • 91,027
  • 3
  • 150
  • 376
0

I think you misunderstood the uniform distribution. The "uniform" part refers to probabilities, not to values. A uniformly distributed random variable can take many different values, but each value has the same probability (in the discrete case) or, equivalently, the density is constant over the interval (in the continuous case).

Thus the CLT works just as it does for other distributions: different samples yield different sample means, and those means form an approximately normal distribution centered asymptotically on the population mean.
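
A short sketch of this with a discrete uniform variable (assuming NumPy; the fair six-sided die and the sample size of 30 are arbitrary illustrative choices). Each roll is uniform on $\{1, \dots, 6\}$, yet the sample means pile up in a bell shape around $3.5$:

```python
import numpy as np

rng = np.random.default_rng(1)
# 50,000 samples of 30 fair die rolls each (values 1..6, equal probability).
rolls = rng.integers(1, 7, size=(50_000, 30))
sample_means = rolls.mean(axis=1)

# Crude text histogram: the bars form a bell shape centered near 3.5.
counts, edges = np.histogram(sample_means, bins=20)
for count, left_edge in zip(counts, edges):
    print(f"{left_edge:4.2f} {'#' * int(60 * count / counts.max())}")
```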

Trivio
  • 86
  • 1
  • 7
  • 1
    Because a uniform distribution is continuous, by the very definition of continuous, the "same probability" you refer to is *zero.* – whuber Feb 05 '19 at 19:47
  • I was referring to the discrete uniform distribution to keep it short; but see edits. – Trivio Feb 05 '19 at 22:16
0

A random variable $X$ (including a uniform random variable) needs two things. It needs (a) a set of values that $X$ can take on (this set is called the support), and (b) a function that expresses how often a realization $X = x$ happens relative to the other possible realizations. That is the pdf (or pmf) $f(x)$. For a uniform RV on $[0, 1]$, we specify $X$ by the support $\{x \mid 0 \le x \le 1\}$ and the density $f(x) = 1$.

If $X_1$ and $X_2$ are independent uniform RVs, then probability theory tells us that a new random variable can be defined as $Z=X_1+X_2$. The support of $Z$ is $0 \le z \le 2$, and the details of $f(z)$ can be worked out with calculus.
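
For instance, carrying out that calculus (convolving the two unit densities) gives the familiar triangular density:

$$f_Z(z) = \int f(x)\, f(z - x)\, dx = \begin{cases} z & 0 \le z \le 1, \\ 2 - z & 1 < z \le 2. \end{cases}$$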

The miraculous thing is that if we average $N$ independent random variables (with almost any underlying distribution), the result $Z=\frac{X_1+X_2+\cdots+X_N}{N}$ is approximately normally distributed; $N = 30$ is a common rule of thumb for when the approximation becomes good. Yes, it is OK to be amazed by that. There is a proof (and so it is "obvious" by Feynman's Theorem), but it is still an amazing fact.

Peter Leopold
  • 1,653
  • 7
  • 21