Suppose there are 10 random variables distributed as $U[0,\theta]$ ($\theta$ is assumed unknown). I know the sample mean ($\bar{X}_{10}$) and the sample variance ($\hat{s}_{10}$). Can I find a 95% CI for the mean?

My answer was correct: yes, we can. But I assumed that we can use the t-distribution for this, because the sample mean has a normal distribution (by the CLT) but the sample is small ($n=10$), hence we use the t-distribution.

But that reasoning seems to be wrong: we can't use the t-distribution to find the CI in this case.

Why can't we use the t-distribution to find the CI in this case? Is it because the $X_i$ are not normally distributed? I want to understand why the t-distribution is not applicable here, given that I know the sample mean and variance.

Thanks in advance.

PS: The correct derivation of the CI is a little complicated and uses the Irwin-Hall distribution.

Sharov
  • because t and normal distributions are approximations – gunes Aug 22 '20 at 13:25
  • @gunes: Thank you for the reply, but it is still not clear to me. We know that $\bar X_n$ has a normal distribution for large $n$, and I thought it has a t-distribution for small $n$. Why not use them? – Sharov Aug 22 '20 at 14:28
  • no, you can’t expect the mean of RVs from any distribution to yield a t-distribution. – gunes Aug 22 '20 at 14:31
  • @gunes: So in this problem because of small n I need to use Irwin-Hall distribution? – Sharov Aug 22 '20 at 15:18
  • @gunes: Very good point! – Xi'an Aug 22 '20 at 15:28
  • if the question asks for exact CI you should use irwin-hall or maybe Bates – gunes Aug 22 '20 at 16:04
  • @COOLSerdash: Thank you for your link. I just wanted more theory about why the t-distribution is not applicable here. For large $n$ I can use the normal distribution for $\bar{X}_n$ via the CLT; for small $n$ I hoped to use the t-distribution. But this is not the case... – Sharov Aug 22 '20 at 19:39
  • The sum of iid uniform variables is exceptionally close to Normal even for very small $n.$ See the illustration at the end of https://stats.stackexchange.com/a/43075/919 for the case $n=8.$ – whuber Aug 22 '20 at 22:31
  • Why would the t- statistic (which has a denominator, not just a numerator) have a t-distribution? – Glen_b Aug 23 '20 at 02:58
  • @Glen_b: Good question, I don't know... According to wiki it seems not always to be the case. – Sharov Aug 23 '20 at 11:15

1 Answer

Here is an approach using the maximum observation (the sufficient statistic for $\theta),$ rather than the sample mean and standard deviation. (Of course $\mu = \theta/2$ can also be estimated by $\bar X,$ but with more variability; see Notes at end.)

Let $W$ be the maximum of $n=10$ observations from $\mathsf{Unif}(0, \theta).$ Then it is not difficult to show that $W/\theta \sim \mathsf{Beta}(n, 1):$

$$P\left(\frac{W}{\theta} \le w\right) = P(U_1 \le w, \dots, U_n \le w) = \prod_{i=1}^{n} P(U_i \le w) = w^n,$$ for $U_i\stackrel{iid}{\sim}\mathsf{Unif}(0,1)$ and $0 \le w \le 1.$ With $n = 10,$ this is the CDF of $\mathsf{Beta}(10,1).$

Thus for $n=10,$ $$P\left(L \le \frac{W}{\theta}\le U\right) = P\left(\frac{W}{U} \le \theta \le \frac{W}{L}\right) = 0.95,$$ where $L$ and $U$ cut probability 0.025 from the lower and upper tails, respectively, of $\mathsf{Beta}(10, 1).$ A 95% CI for $\theta$ is of the form $(W/0.9975,\, W/0.6915).$

qbeta(c(.025,.975),10,1)
[1] 0.6915029 0.9974714

In particular, consider the simulated sample of size $n=10$ below from $\mathsf{Unif}(0, 15).$ The maximum is $W = 14.9248$ and a 95% confidence interval for $\theta$ is $(14.96,\, 21.58).$

set.seed(822)
x = runif(10, 0, 15)
summary(x)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 0.4545  2.2264  7.1609  7.6550 13.2045 14.9248 
w = max(x)
w/qbeta(c(.975,.025),10,1)
[1] 14.96265 21.58315
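
The interval above can be wrapped in a small function and its coverage checked by simulation. The following is only a sketch: the helper name `max_ci`, the number of replications, and the reuse of $\theta = 15$ are illustrative assumptions, not part of the original computation. It relies on the fact that the $\mathsf{Beta}(n,1)$ quantile function is $p^{1/n},$ which follows from the CDF $w^n$ shown earlier.

```r
# Hypothetical helper (name chosen for illustration): CI for theta
# based on the sample maximum, using W/theta ~ Beta(n, 1).
max_ci <- function(x, level = 0.95) {
  n <- length(x)
  alpha <- 1 - level
  # Beta(n, 1) has CDF w^n, so its quantile function is p^(1/n);
  # the upper quantile comes first so the result is (lower, upper).
  q <- c(1 - alpha / 2, alpha / 2)^(1 / n)
  max(x) / q
}

# Coverage check by simulation (theta = 15 is an arbitrary choice).
set.seed(822)
theta <- 15
covered <- replicate(10^4, {
  ci <- max_ci(runif(10, 0, theta))
  ci[1] <= theta && theta <= ci[2]
})
mean(covered)   # proportion of intervals containing theta; should be near 0.95
```

Because the pivot $W/\theta$ has an exact Beta distribution, the coverage should match the nominal level up to simulation error, whatever value of $\theta$ is used.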

Notes: (1) A 95% CI for $\theta$ based on the maximum $X_{(10)} = W$ of $n = 10$ independent observations from $\mathsf{Unif}(0,\theta)$ has average length $0.403\theta$ because $E(W) = \frac{10}{11}\theta.$

diff((10/11)/qbeta(c(.975,.025), 10, 1))
[1] 0.4032641
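
The expectation used in Note (1) follows from the CDF of the maximum. A short worked derivation, written for general $n$ (with $n = 10$ here) and using $P(W \le w) = (w/\theta)^n$ for $0 \le w \le \theta$:

$$f_W(w) = \frac{d}{dw}\left(\frac{w}{\theta}\right)^n = \frac{n\,w^{n-1}}{\theta^n}, \quad 0 \le w \le \theta,$$

$$E(W) = \int_0^\theta w\,\frac{n\,w^{n-1}}{\theta^n}\,dw = \frac{n}{n+1}\,\theta = \frac{10}{11}\,\theta.$$

The interval $(W/U,\, W/L)$ has length $W(1/L - 1/U),$ so its average length is $E(W)\,(1/L - 1/U) \approx \frac{10}{11}\theta \times 0.4436 \approx 0.403\theta.$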

(2) Because $\mu = \theta/2,$ halving the endpoints of the CI for $\theta$ gives a 95% CI for $\mu$ based on the maximum, with average length about $0.202\theta.$ (The bias-corrected MLE of $\mu = \theta/2$ is $0.55W.$)

.5*diff((10/11)/qbeta(c(.975,.025),10,1))
[1] 0.2016321

(3) By contrast, if we use t methods, basing the CI for $\mu = \theta/2$ on the sample mean $\bar X$ and sample standard deviation $S,$ a simulation estimates that the average length of a 95% CI is about $0.41\theta,$ considerably longer than the CI based on the maximum.

set.seed(822)
len = replicate(10^5, diff(t.test(runif(10))$conf.int))
mean(len)
[1] 0.4071168

(4) There are at least two similar Q & As on this site, but in my view, neither is a duplicate.

The page linked above uses the mean and variance of a sample of size $n=10$ from $\mathsf{Unif}(0,1)$ to get CIs for $\mu.$ A simulation there shows that nominal "95%" CIs really have coverage probability about $94.7\%,$ and the Answer proposes a more exact interval based on the midrange and range.

This page uses a sample of size $n = 10$ from $\mathsf{Unif}(\mu-.5,\mu+.5)$ and an Answer shows that $\bar X \stackrel{aprx}{\sim}\mathsf{Norm}(\mu, \sigma=1/\sqrt{12n}),$ which is used to make a 95% CI.
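
The behaviour of the t interval for uniform data, discussed in Notes (3) and (4), can be explored with a vectorized simulation. This is a rough sketch under assumed settings (the seed, the $10^4$ replications, and the variable names are arbitrary choices, not taken from either linked page); it computes the usual t interval $\bar X \pm t_{0.975,\,n-1}\,S/\sqrt{n}$ by hand rather than calling `t.test`.

```r
# Sketch: coverage and average length of nominal 95% t intervals
# for the mean of Unif(0, 1) samples of size n = 10.
set.seed(2020)
n <- 10
B <- 10^4                              # number of simulated samples (arbitrary)
x <- matrix(runif(n * B), nrow = n)    # each column is one sample
xbar <- colMeans(x)                    # sample means
s <- apply(x, 2, sd)                   # sample standard deviations
half <- qt(0.975, df = n - 1) * s / sqrt(n)   # half-width of each t interval

# True mean of Unif(0, 1) is 0.5
coverage <- mean(xbar - half <= 0.5 & 0.5 <= xbar + half)
coverage        # estimated coverage probability
mean(2 * half)  # estimated average length, in units of theta (theta = 1)
```

If the estimated coverage comes out slightly below 0.95, that is consistent with the t interval being an approximation for non-normal data rather than an exact procedure.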

BruceET
  • Thank you for your time and effort. I have a few small questions: 1) Why is $E(W) = 10/11$ and not $E\left(\frac{W}{\theta}\right) = 10/11$? 2) The MLE of $\mu=\theta/2$ is $0.55W$; where does 0.55 come from? – Sharov Aug 23 '20 at 12:29
  • Thanks for noting the typo (now fixed): $E(W)=\frac{10}{11}\theta.$ Notice that $W \le \theta.$ Also, $\hat \theta_{unb} = \frac{11}{10}W$ and the unbiased MLE of $\mu=\theta/2$ is $\mu_{unb}= \frac{1}{2}\frac{11}{10}W = 0.55W.$ – BruceET Aug 23 '20 at 17:50