Is my work correct (easy problem, confidence intervals)

Question

The r.v. $X$ represents the time taken by a computer in company $1$ to perform a certain job, and $Y$ represents the same quantity for company $2$. A sample of $n_X = 12$ computers is taken from company $1$, giving $\bar x = 65$ and $s_X ^2 = 279$. A sample of $n_Y = 8$ computers is taken from company $2$, giving $\bar y = 48$ and $s_Y ^2 = 224$.

I am required to find a $0.95$ confidence interval for the difference between the means of the two populations.

What I did:

Because $\bar x > \bar y$, let's find the C.I. for the difference $\mu_X - \mu_Y$. To do this, we note the following.

The variances are unknown, and the samples are small ($n_X + n_Y - 2 = 18 \le 30$). So we must consider:

$$T = \frac{(\bar X - \bar Y) - (\mu_X - \mu_Y)}{\hat \sigma \sqrt{\frac1{n_X} + \frac1{n_Y}}}$$

Where:

$$\hat \sigma^2 = \frac{n_X S_X ^2 + n_Y S_Y ^2}{n_X + n_Y -2}$$

Assuming normal populations with a common variance, $T$ has a Student's t distribution with $\nu = n_X + n_Y - 2 = 18$ degrees of freedom.

$$- t \le T \le t \iff - t \le \frac{(\bar X - \bar Y) - (\mu_X - \mu_Y)}{\hat \sigma \sqrt{\frac1{n_X} + \frac1{n_Y}}} \le t \iff ... \iff \\ (\bar X - \bar Y) - t \hat \sigma \sqrt{\frac1{n_X} + \frac1{n_Y}} \le \mu_X - \mu_Y \le (\bar X - \bar Y) + t \hat \sigma \sqrt{\frac1{n_X} + \frac1{n_Y}}$$
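
The elided steps in the chain above can be spelled out as follows. This is only a sketch: $c$ is shorthand introduced here (not in the original) for the positive factor $\hat \sigma \sqrt{\frac1{n_X} + \frac1{n_Y}}$. Multiply through by $c$, subtract $\bar X - \bar Y$, then multiply by $-1$, which reverses the inequalities:

$$\begin{aligned}
- t\,c &\le (\bar X - \bar Y) - (\mu_X - \mu_Y) \le t\,c \\
-(\bar X - \bar Y) - t\,c &\le -(\mu_X - \mu_Y) \le -(\bar X - \bar Y) + t\,c \\
(\bar X - \bar Y) - t\,c &\le \mu_X - \mu_Y \le (\bar X - \bar Y) + t\,c
\end{aligned}$$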

Now we find $t$ from the table (the $0.975$ quantile with $18$ degrees of freedom) and substitute the known values to get:

C.I $= \left[ 0.457, 33.542 \right]$

I don't care about the part with calculations, but my question is:

Is my work correct?

The next part of the question asks whether we can say that company $1$ has faster computers than company $2$ at a risk of $0.05$. I know how to do this by testing the hypothesis $\mu_X = \mu_Y$ against $\mu_X > \mu_Y$. But is there a way to do it that makes use of the first part?

Please add the `[self-study]` tag & read its [wiki](http://stats.stackexchange.com/tags/self-study/info). — gung - Reinstate Monica, Sep 03 '15 at 01:49
@gung: I added it. The specific problem I encountered is whether I interpreted the question and applied the formula correctly. — George, Sep 03 '15 at 01:52
Crossposted on Math: http://math.stackexchange.com/q/1419054/23353 — apnorton, Sep 03 '15 at 13:38

score 1 · Accepted Answer · answered Sep 03 '15 at 02:21

Your formulas are correct, but the calculations might not be very accurate.

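```r
# Summary statistics from the question
n1 <- 12
n2 <- 8
x_bar <- 65
y_bar <- 48
sx_2 <- 279
sy_2 <- 224

# Pooled standard deviation; this form treats sx_2 and sy_2 as the
# unbiased (n - 1 denominator) sample variances
sp <- sqrt(((n1 - 1) * sx_2 + (n2 - 1) * sy_2) / (n1 + n2 - 2))

# 0.975 quantile of the t distribution with 18 df
t <- qt(0.975, n1 + n2 - 2)

CL1 <- (x_bar - y_bar) - t * sp * sqrt(1 / n1 + 1 / n2)  # lower 95% limit
CL1
# 1.608831

CL2 <- (x_bar - y_bar) + t * sp * sqrt(1 / n1 + 1 / n2)  # upper 95% limit
CL2
# 32.39117
```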

95% CI: [1.61, 32.39]
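
The interval depends on which definition of the sample variance the given values $279$ and $224$ follow. The sketch below computes it under both readings. The helper `pooled_ci` and the variable names are made up for illustration, and which reading applies is an assumption the question's source would have to settle.

```r
# Summary statistics from the question
n_x <- 12
n_y <- 8
x_bar <- 65
y_bar <- 48
s2_x <- 279
s2_y <- 224

# Hypothetical helper: CI for mu_X - mu_Y given a pooled variance estimate
pooled_ci <- function(pooled_var, level = 0.95) {
  df <- n_x + n_y - 2
  half_width <- qt(1 - (1 - level) / 2, df) *
    sqrt(pooled_var * (1 / n_x + 1 / n_y))
  c(lower = (x_bar - y_bar) - half_width,
    upper = (x_bar - y_bar) + half_width)
}

# Reading 1 (assumption): s2 uses the n - 1 denominator (unbiased estimator)
var_unbiased <- ((n_x - 1) * s2_x + (n_y - 1) * s2_y) / (n_x + n_y - 2)

# Reading 2 (assumption): s2 uses the n denominator, so n * s2 is the
# sum of squared deviations
var_biased <- (n_x * s2_x + n_y * s2_y) / (n_x + n_y - 2)

ci_unbiased <- pooled_ci(var_unbiased)
ci_biased <- pooled_ci(var_biased)
print(round(rbind(ci_unbiased, ci_biased), 2))
```

Both formulas estimate the same pooled variance, because $n s^2$ under the $n$ definition equals $(n-1) s^2$ under the $n-1$ definition. The first reading reproduces the interval above. The second gives roughly [0.80, 33.20].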

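Since the 95% CI does not include 0, I think you can reject $\mu_X = \mu_Y$ at the 0.05 type I error level. Note that the interval lies entirely above 0, which points to $\mu_X > \mu_Y$: since $X$ and $Y$ are times, company 1's computers take longer on average, so they are slower than company 2's, not faster.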
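
One way to reuse the interval for the one-sided question is sketched here. A two-sided 95% interval corresponds to a two-sided test at 0.05. A one-sided test at 0.05 against $\mu_X > \mu_Y$ corresponds to a one-sided 95% lower bound, which is the lower end of a 90% two-sided interval. The names `se`, `t_stat` and `lower_bound` are illustrative, and the variances are assumed to use the $n - 1$ denominator.

```r
# Summary statistics from the question
n_x <- 12
n_y <- 8
x_bar <- 65
y_bar <- 48
s2_x <- 279  # assumed to use the n - 1 denominator
s2_y <- 224  # assumed to use the n - 1 denominator

df <- n_x + n_y - 2
pooled_var <- ((n_x - 1) * s2_x + (n_y - 1) * s2_y) / df
se <- sqrt(pooled_var * (1 / n_x + 1 / n_y))

# Test statistic for H0: mu_X = mu_Y
t_stat <- (x_bar - y_bar) / se

# One-sided p-value for H1: mu_X > mu_Y (company 1's mean time is larger)
p_one_sided <- pt(t_stat, df, lower.tail = FALSE)

# One-sided 95% lower confidence bound for mu_X - mu_Y
lower_bound <- (x_bar - y_bar) - qt(0.95, df) * se

# The test rejects H0 at level 0.05 exactly when the lower bound exceeds 0
reject <- t_stat > qt(0.95, df)
print(c(t_stat = t_stat, p_one_sided = p_one_sided, lower_bound = lower_bound))
print(reject)
```

The comparison `t_stat > qt(0.95, df)` and the check `lower_bound > 0` are the same condition rearranged, so the bound from the first part can stand in for the test.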

Deep North


Thanks. I define $s_X^2$ to be the sample variance and **not** the estimator of $\sigma_X ^2$. Sorry for the confusion. I guess with that my calculations are correct? – George Sep 03 '15 at 02:25
Yes, your formulas and procedures are correct, but you may need to check the calculation. – Deep North Sep 03 '15 at 02:27
Pardon me, but I see that you used $\frac{(n_1 -1)s_X^2 + (n_2 - 1)s_Y ^2}{n_1 + n_2 - 2}$ instead of $\frac{n_1 s_X^2 + n_2 s_Y ^2}{n_1 + n_2 - 2}$. So it seems to me that you are treating $s_X^2$ as the point estimate of the population variance, while it is in fact the sample variance. I want to know whether there is a misunderstanding or an error in my formula. Thanks for your help. – George Sep 03 '15 at 02:32
My question is: do you use the symbol $s_X ^2$ to denote the unbiased estimator of the population variance? I define $s_X ^2$ to be the sample variance, and $\frac{n_X}{n_X - 1} \times s_X ^2$ to be the unbiased estimator of the population variance. Some people define $s_X ^2$ otherwise (i.e. it being the unbiased estimator). So I want to know if there's a misunderstanding. thanks again for your time. – George Sep 03 '15 at 02:45
Ok, I see the confusion: some people define the sample variance as $s=\sum (x_i-\bar{x})/(n-1)$, and some define it as $s=\sum (x_i-\bar{x})/n$. – Deep North Sep 03 '15 at 03:12
Exactly. I define it as the last one (with $n$). – George Sep 03 '15 at 03:13
So according to my definition, would my formula be correct? – George Sep 03 '15 at 03:17
and here, http://stats.stackexchange.com/questions/3931/intuitive-explanation-for-dividing-by-n-1-when-calculating-standard-deviation – Deep North Sep 03 '15 at 03:25
It does not matter too much. You can read http://stats.stackexchange.com/questions/100041/how-exactly-did-statisticians-agree-to-using-n-1-as-the-unbiased-estimator-for – Deep North Sep 03 '15 at 03:31
Ok, I missed a square in the sample variance formula; I always make mistakes. – Deep North Sep 03 '15 at 04:48
