Confidence Interval for Pairs of Normal Data

Question

Suppose we have $\alpha_1,\dots,\alpha_n,\beta_1,\dots,\beta_n \sim N(0,1)$.

Define $x_i = \mu_i + \sigma\alpha_i$, $y_i = \mu_i + \sigma\beta_i$, for $(\mu_1,\dots,\mu_n) \in \mathbb{R}^n$, $\sigma > 0$.

Consider the model $(x_1,y_1),\dots,(x_n,y_n)$. Maximizing its likelihood function yields:

$\mu_i* = \frac{x_i+y_i}{2}$ as the MLE for $\mu_i$, and $\sigma* = \frac{1}{4n}\sum_{i=1}^{n}(x_i-y_i)^2$ as the MLE for $\sigma^2$.

It follows that $\mu_i*$ is unbiased for $\mu_i$, and $2\sigma*$ is unbiased for $\sigma^2$.
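The unbiasedness claim can be checked numerically. The following is a minimal simulation sketch, reading $\sigma*$ as an estimate of the variance $\sigma^2$ (as the $\sqrt{\sigma*}$ in the intervals suggests) and assuming all the $\alpha_i, \beta_i$ are independent. The function names, the seed, and the values chosen for $\sigma$ and $\mu_i$ are illustrative assumptions, not part of the question.

```python
import random


def mle_sigma2(x, y):
    """The estimator from the question: (1 / 4n) * sum((x_i - y_i)^2)."""
    n = len(x)
    return sum((a - b) ** 2 for a, b in zip(x, y)) / (4 * n)


def average_estimate(n=23, sigma=2.0, reps=20000, seed=0):
    """Average the estimator over many simulated data sets (illustrative setup)."""
    rng = random.Random(seed)
    # Arbitrary per-pair means; they cancel in x_i - y_i, so their values do not matter.
    mu = [rng.uniform(-5.0, 5.0) for _ in range(n)]
    total = 0.0
    for _ in range(reps):
        x = [m + sigma * rng.gauss(0.0, 1.0) for m in mu]
        y = [m + sigma * rng.gauss(0.0, 1.0) for m in mu]
        total += mle_sigma2(x, y)
    return total / reps


avg = average_estimate()
# With sigma = 2 the true variance is 4: avg should be near 2, and 2 * avg near 4.
print(avg, 2 * avg)
```

Under these assumptions the average of $\sigma*$ settles near $\sigma^2/2$, so doubling it recovers $\sigma^2$ on average.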

I want to build a $0.99$ confidence interval for $\mu_i$, supposing $n=23$;

I'm unsure which is the correct way to calculate this -

either $[\mu_i* \pm \frac{\sqrt{\sigma*}}{\sqrt{23}}t_{22,.975}]$

or $[\mu_i* \pm \frac{\sqrt{\sigma*}}{\sqrt{2}}t_{22,.975}]$. I think the latter is correct since we would base this CI on the Central Limit Theorem using the mean of $x_i$ and $y_i$.

Is $u_i$ indeed dependent on $i$? If so, why is $u*$ without an index - shouldn't it depend on $i$ too? Also, where did $u$ come from, and what is its relationship to the $u_i$? — Ami Tavory, Jul 01 '17 at 17:23
You still have "It follows the $\mu*$ is unbiased for $\mu$, and $2\sigma*$ for $\sigma$.". Also, I don't quite get the relevance of $n$. If indeed $\mu_i$ and $\sigma_i$ are dependent on $i$ and unrelated, why does having 23 of them make a difference? — Ami Tavory, Jul 01 '17 at 17:45
@AmiTavory edited again :). It makes a difference only because our estimate of $\sigma$ is dependent upon $23$ samples of pairs, not $2$. — Mariah, Jul 01 '17 at 17:59
Ah, much clearer now. I was guessing that the $\sigma$ was a typo too, and it was actually $\sigma_i$. — Ami Tavory, Jul 01 '17 at 18:17

Ami Tavory · Accepted Answer · 2017-07-01T20:52:05.190

Suppose we have $\alpha_1,\dots,\alpha_n,\beta_1,\dots,\beta_n \sim N(0,1)$... Define $x_i = \mu_i + \sigma\alpha_i$, $y_i = \mu_i + \sigma\beta_i$, for $(\mu_1,\dots,\mu_n) \in \mathbb{R}^n$, $\sigma > 0$... $\sigma* = \frac{1}{4n}\sum_{i=1}^{n}(x_i-y_i)^2$... I want to build a $0.99$ confidence interval for $\mu_i$, supposing $n=23$;

When considering the confidence interval for $\mu_i$, there are two things that need to be taken into account:

1. The estimate of $\sigma$ might be off.
2. For the true $\sigma$, the outcome of $x_i - y_i$ might be off.

Let's consider the first item. $ x_i - y_i \sim \mathcal{N}(0, 2\sigma^2)$. As such, $$ \frac{1}{2n} \sum_{i = 1}^n \left( x_i - y_i \right)^2 $$ is an unbiased estimator for $\sigma^2$ in the known-mean case (why $4n$ in your question?). However, $ \frac{1}{2n} \sum_{i = 1}^n \left( x_i - y_i \right)^2 $ is itself a random variable, and so has variance. Using the properties of chi-square distributions (see this question), a $1 - \epsilon$ confidence interval for $2 \sigma^2$ is

$$ \left(\frac{S^2 n}{\chi^2_{1-\epsilon/2}}, \frac{S^2 n}{\chi^2_{\epsilon/2}}\right). $$

The worst case is the right boundary of this interval, since a larger variance gives a wider interval for $\mu_i$. If you plug in this value for $2 \sigma^2$, you can calculate a $1 - \delta$ interval for the normal distribution using the usual method.
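The two steps above can be sketched in code. This is a minimal illustration, not a definitive implementation, and it rests on several assumptions that the answer does not state: $S^2 = \frac{1}{n}\sum_{i}(x_i - y_i)^2$, the chi-square quantiles have $n$ degrees of freedom (known mean $0$ for $x_i - y_i$), $\epsilon$ and $\delta$ are error probabilities, and $\mu_i* = (x_i + y_i)/2$ has variance $\sigma^2/2 = (2\sigma^2)/4$. The helpers `chi2_cdf`, `chi2_ppf` and `mu_interval` are hypothetical names; the quantile is found by bisection on a series for the incomplete gamma function so that only the standard library is needed.

```python
import math
import random
from statistics import NormalDist


def chi2_cdf(x, df):
    """Chi-square CDF via the series for the regularized lower incomplete gamma.
    Adequate for moderate df and x; not a general-purpose routine."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))
    total, k = term, 1
    while term > 1e-17 * total:
        term *= z / (a + k)
        total += term
        k += 1
    return min(total, 1.0)


def chi2_ppf(p, df):
    """Chi-square quantile by bisection on chi2_cdf."""
    lo, hi = 0.0, float(df)
    while chi2_cdf(hi, df) < p:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def mu_interval(x, y, i, eps, delta):
    """Interval for 2*sigma^2, then a normal interval for mu_i at the worst case."""
    n = len(x)
    ss = sum((a - b) ** 2 for a, b in zip(x, y))  # equals n * S^2
    var_lo = ss / chi2_ppf(1.0 - eps / 2.0, n)    # left boundary for 2*sigma^2
    var_hi = ss / chi2_ppf(eps / 2.0, n)          # right boundary (worst case)
    sd_worst = math.sqrt(var_hi / 4.0)            # sd of (x_i + y_i) / 2 when 2*sigma^2 = var_hi
    z = NormalDist().inv_cdf(1.0 - delta / 2.0)
    center = (x[i] + y[i]) / 2.0
    return (var_lo, var_hi), (center - z * sd_worst, center + z * sd_worst)


# Illustrative data only: n = 23 pairs with arbitrary means and sigma = 1.5.
rng = random.Random(1)
mu = [rng.uniform(-3.0, 3.0) for _ in range(23)]
x = [m + 1.5 * rng.gauss(0.0, 1.0) for m in mu]
y = [m + 1.5 * rng.gauss(0.0, 1.0) for m in mu]
eps = delta = 1.0 - math.sqrt(0.99)
(var_lo, var_hi), (mu_lo, mu_hi) = mu_interval(x, y, 0, eps, delta)
print((var_lo, var_hi), (mu_lo, mu_hi))
```

Because the half-width is built from the right boundary of the interval for $2\sigma^2$, the resulting interval for $\mu_i$ is deliberately conservative.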

However, note that you need to take both uncertainties into account. One way of bounding the probability of error is to take the overall coverage probability as $(1 - \epsilon) ( 1 - \delta)$. If you're looking for a 99% CI, you need to use $\epsilon, \delta$ such that $(1 - \epsilon) ( 1 - \delta) \geq 0.99$. One way of doing so (not necessarily the optimal one) would be taking $\epsilon = \delta = 1 - \sqrt{0.99}$.
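As a quick arithmetic check of this choice, the sketch below computes $\epsilon = \delta = 1 - \sqrt{0.99}$ and the resulting product. For comparison only, it also shows the equal split under a union (Bonferroni) bound, $\epsilon + \delta \leq 0.01$, which is not part of the answer above and is included here as an assumption-free alternative that does not need independence. Variable names are illustrative.

```python
import math

target = 0.99

# Equal split under the product rule: (1 - eps) * (1 - delta) = target.
eps = delta = 1.0 - math.sqrt(target)
coverage_product = (1.0 - eps) * (1.0 - delta)

# Equal split under a union bound instead: eps + delta = 1 - target.
eps_union = (1.0 - target) / 2.0

print(eps, coverage_product, eps_union)
```

The product rule permits a slightly larger $\epsilon$ (about $0.00501$) than the union bound ($0.005$), so the two intervals differ only marginally at this confidence level.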

This is a good answer, thanks. I'm wondering specifically whether, assuming I keep the format of the CIs as I wrote them, to divide in the denominator by $\sqrt{2}$ or by $\sqrt{23}$; as I said, I think it's by $\sqrt{2}$ — Mariah, Jul 01 '17 at 20:43
@Mariah I'll update that shortly. However, I think you need to take into account the confidence of the estimator for $2 \sigma^2$. IINM, you're not doing that in the calculation outlined in your answer, no? There's also the factor $\frac{1}{4n}$ where I don't follow your reasoning. — Ami Tavory, Jul 01 '17 at 20:46
@Mariah Regarding your original question, I agree with what you wrote there - you should use the second form. I disagree with the two other points above, though. — Ami Tavory, Jul 01 '17 at 20:55
