Could you help me determine the minimal sample size for a uniformly distributed sample?
Assume that I have found the sample average, the standard deviation and $\alpha$.
After the discussion in the comments, I rewrote most of this answer...
It seems that the question can be interpreted as "how to find a confidence interval for the mean".
1. Without any assumption on the distribution
You sample independent $X_1, \dots, X_n$ from an unknown distribution with finite mean $\mu$ and variance $\sigma^2$. By the central limit theorem, for $n$ big enough, you can approximate the distribution of the sample mean $\overline X = {1\over n} \sum_{i=1}^n X_i$ by a $\mathcal N \left( \mu, {1\over n}\sigma^2\right)$.
Approximating $\sigma^2$ by $\widehat{\sigma^2} = {n \over n-1} \left( {1\over n} \sum_i X_i^2 - \left(\overline X\right)^2\right)$, an asymptotic confidence interval of level $1-\alpha$ on $\mu$ is given by $$ \left[ \overline X - z_{1-\alpha/2} \sqrt{\widehat{\sigma^2} \over n} ; \overline X + z_{1-\alpha/2} \sqrt{\widehat{\sigma^2} \over n} \right],$$ where $z_{1-\alpha/2}$ is the $1-\alpha/2$ quantile of the standard normal distribution (e.g., for $\alpha = 0.05$, $z_{0.975} = 1.96$).
This can be useful to compute a rough estimate of the number of additional observations you may need to collect: plug the current estimate $\widehat{\sigma^2}$ into the half-width $z_{1-\alpha/2} \sqrt{\widehat{\sigma^2} / n}$ and solve for the $n$ that gives the precision you want.
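As a rough sketch of that calculation, suppose you want the half-width of the interval to be at most some target `E` (a symbol introduced here for illustration; it does not appear above). Solving $z_{1-\alpha/2}\sqrt{\widehat{\sigma^2}/n} \le E$ for $n$ gives $n \ge z_{1-\alpha/2}^2\, \widehat{\sigma^2} / E^2$. The helper name `clt_sample_size` and the pilot value `0.29` are hypothetical:

```r
# Hypothetical helper: smallest n such that the CLT-based half-width
# z * sd_hat / sqrt(n) is at most E, treating sd_hat as if it were the true sd.
clt_sample_size <- function(sd_hat, E, alpha = 0.05) {
  z <- qnorm(1 - alpha / 2)   # standard normal quantile z_{1 - alpha/2}
  ceiling((z * sd_hat / E)^2)
}

# Example: a pilot sample gave sd_hat = 0.29 (about the sd of U(0,1));
# number of observations needed for a half-width of 0.02 at alpha = 0.05.
clt_sample_size(0.29, 0.02)
```

Since $\widehat{\sigma^2}$ is itself an estimate, the result is only indicative; it is worth recomputing once more data are in.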
2. Assuming that the distribution is uniform
Note that this assumption shouldn’t be made without serious reasons.
So you sample independent $X_1, \dots, X_n$ ($n > 1$) from a uniform distribution $\mathcal U (a,b)$, the bounds of the interval $[a,b]$ being unknown parameters. The expectation is ${1\over 2}(a+b)$.
The maximum likelihood estimators of $a$ and $b$ are $m = \min_i X_i$ and $M = \max_i X_i$. These are not independent, so we consider the joint density of $(m,M)$, which is given by $$\phi(u,v) = \left\{ \begin{array}{ll} {n(n-1) \over (b-a)^n} (v-u)^{n-2} & \mbox{if}\ a \le u \le v \le b \\ 0 & \mbox{else} \end{array}\right.$$
From this it is easy to get the density of ${1\over 2}(m+M)$, which is very concentrated around ${1\over 2}(a+b)$:
$$f(t) = \left\{ \begin{array}{ll} {n 2^{n-1} \over (b-a)^n} (t-a)^{n-1} & a\le t \le {1\over2}(a+b) \\ {n 2^{n-1} \over (b-a)^n} (b-t)^{n-1} & {1\over2}(a+b)\le t \le b \\ \end{array}\right.$$
To get a confidence interval of the form $$\left[ {1\over2}(m+M) - {\gamma\over 2}(M-m) ;{1\over2}(m+M) + {\gamma\over 2}(M-m) \right],$$ we compute $$\mathbb P \left( {1\over2}(m+M) - {\gamma\over 2}(M-m) \le {1\over2}(a+b) \le {1\over2}(m+M) + {\gamma\over 2}(M-m) \right)$$ which is simply equal to $1 - {1\over (1+\gamma)^{n-1}}$, so to get a CI of level $1-\alpha$ you just set $\gamma = e^{-{1\over n-1}\log \alpha} - 1$. This procedure gives a surprisingly small CI (or more precisely, its size decreases surprisingly fast as $n$ increases; for large $n$, $\gamma \sim -{1\over n}\log \alpha$).
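The coverage formula can also be inverted to get a sample size: requiring $1 - (1+\gamma)^{-(n-1)} \ge 1-\alpha$ for a chosen $\gamma$ gives $n \ge 1 + \log(1/\alpha)/\log(1+\gamma)$. Here is a minimal sketch of both directions; the function names `uniform_mean_ci` and `uniform_sample_size` are hypothetical, and everything rests on the assumption that the data really are uniform:

```r
# Hypothetical helper: CI for the mean (a+b)/2 of U(a,b), built from the
# sample minimum and maximum with gamma = alpha^(-1/(n-1)) - 1.
uniform_mean_ci <- function(x, alpha = 0.05) {
  n <- length(x)
  stopifnot(n > 1)
  gamma <- exp(-log(alpha) / (n - 1)) - 1
  m <- min(x); M <- max(x)
  0.5 * ((m + M) + c(-1, 1) * gamma * (M - m))
}

# Hypothetical helper: smallest n for which the given gamma reaches
# level 1 - alpha, i.e. 1 - (1 + gamma)^-(n - 1) >= 1 - alpha.
uniform_sample_size <- function(gamma_target, alpha = 0.05) {
  ceiling(1 + log(1 / alpha) / log(1 + gamma_target))
}

# Example: half-width equal to 5% of half the observed range (gamma = 0.05).
uniform_sample_size(0.05)
```

Note that $\gamma$ is relative to the observed range $M-m$, so the absolute width still depends on the unknown $b-a$.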
3. Short illustration with R and $n = 50$, $n=1000$
It is plain that the CI obtained by the second method is much shorter. Just an illustration of this:
> n <- 50
> x <- runif(n)
> gamma <- exp(-log(0.05)/(n-1)) - 1
> m <- min(x); M <- max(x)
> 0.5*( (m+M) + c(-1,1)*gamma*(M-m) )
[1] 0.4799359 0.5404375
> mean(x) + c(-1,1)*1.96*sd(x)/sqrt(n)
[1] 0.4290694 0.5892294
And with $n=1000$
> n <- 1000
...
> 0.5*( (m+M) + c(-1,1)*gamma*(M-m) )
[1] 0.4984559 0.5014571
> mean(x) + c(-1,1)*1.96*sd(x)/sqrt(n)
[1] 0.4805048 0.5161008
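A shorter interval is only useful if it still covers the true mean at the stated level. The following is a small Monte Carlo sketch, not part of the original runs, that estimates the coverage of both procedures on $\mathcal U(0,1)$ samples (true mean $0.5$); the seed, the number of replications `reps` and the variable names are arbitrary choices made here:

```r
# Monte Carlo check of coverage for samples from U(0, 1), true mean 0.5.
set.seed(1)                       # arbitrary seed, for reproducibility only
n <- 50; alpha <- 0.05; reps <- 10000
gamma <- exp(-log(alpha) / (n - 1)) - 1
z <- qnorm(1 - alpha / 2)

covered <- replicate(reps, {
  x <- runif(n)
  m <- min(x); M <- max(x)
  ci_unif <- 0.5 * ((m + M) + c(-1, 1) * gamma * (M - m))  # min/max based CI
  ci_clt  <- mean(x) + c(-1, 1) * z * sd(x) / sqrt(n)      # CLT based CI
  c(uniform = ci_unif[1] <= 0.5 && 0.5 <= ci_unif[2],
    clt     = ci_clt[1]  <= 0.5 && 0.5 <= ci_clt[2])
})

rowMeans(covered)   # empirical coverage of each procedure
```

Both proportions should come out close to $1-\alpha$: exactly so in expectation for the min/max interval, and approximately for the CLT interval.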
4. To understand the discussion in the comments, some elements from the original answer
In the original answer, I foolhardily proposed to use the CI procedure based on the central limit theorem, but using the quantity ${1\over 12} (M-m)^2$ as an estimator of the variance, which is ${1 \over 12} (b-a)^2$. This was a curious mixture. I thank everybody for this stimulating discussion.
The minimum sample size for estimating the sample average is 1. The minimum sample size for estimating the standard deviation is 2. I don't know about the minimum sample size for estimating $\alpha$, as I don't know what you intend $\alpha$ to signify.