I need to show that the F statistic equals the square of the t statistic when the t-test compares two independent groups and the variances are assumed equal.

I know that $F=\frac{MSB}{MSW}=\frac{SSB/(k-1)}{SSW/(N-k)}$ and I know that $T=\frac{\bar X-\bar Y}{S_p \sqrt{\frac{1}{n}+\frac{1}{m}}}$,

so $T^2=\frac{(\bar X-\bar Y)^2}{S_p^2 \left(\frac{1}{n}+\frac{1}{m}\right)}$.

I've seen this proof in regression, but here we're not using MSE and MSR, so I'm not sure how to connect the two.

Ferdi
Manko
  • Any proof is a series of steps that logically connect hypotheses to a conclusion. Although your conclusion is clearly stated, what hypotheses do you want to begin with? After all, because both procedures test the same thing by means of the same hypothesis-testing framework using the same assumptions, on that account alone they *must* be equivalent! So if that's not a sufficient proof for you, please indicate where you are beginning and what methods of proof are desired. – whuber Apr 05 '13 at 13:20

3 Answers

I have done this proof on my blog. Since I already have the code for the equations, I'm reproducing it here.


We have to prove that

$$F_{a-1, N-a} = \frac{MST}{MSE} = \frac{\frac{SST}{a-1}}{\frac{SSE}{N-a}} \tag{1}$$

reduces to

$$t_{k}^2 = \frac{(\bar{y}_{1.} - \bar{y}_{2.})^2}{S_{p}^2(\frac{1}{n_{1}} + \frac{1}{n_{2}})} \tag{2}$$

$\color{red} {\text{When a = 2}}$ (this is key)


Notation

$SSE$: Sum of Squares due to Error
$SST$: Sum of Squares of Treatment
$MSE$: Mean Sum of squares Error
$MST$: Mean Sum of squares Treatment
$a$: Number of treatments
$n_{1}$: Number of observations in treatment 1
$n_{2}$: Number of observations in treatment 2
$N$: Total number of observations
$\bar{y}_{i.}$: Mean of treatment $i$
$\bar{y}_{..}$: Global mean
$k = N - a$: Degrees of freedom of the denominator of F


Now that we have the formulas, we will work through the following steps:

  1. Denominator of equation (1)
  2. Numerator of equation (1)
    2.a. Part a
    2.b. Part b
    2.c. Part c
  3. Put all together

1. Denominator of equation (1)

When $a = 2$ the denominator of expression $(1)$ is:

$$MSE = \frac{SSE}{N-2} = \frac{\sum_{j=1}^{n_1}{(y_{1j} - \bar{y}_{1.})^2} + \sum_{j=1}^{n_2}{(y_{2j} - \bar{y}_{2.})^2}}{N-2} \tag{3}$$

Recalling that the sample variance estimator is $$S_{i}^2 = \frac{\sum_{j=1}^{n_i}(y_{ij} - \bar{y}_{i.})^2}{n_{i} - 1}$$ we can multiply and divide each sum in the numerator of $(3)$ by $(n_{i} - 1)$ to get $(4)$. Don't forget that in this case $N = n_{1} + n_{2}$.

$$\frac{SSE}{N-2} = \frac{(n_{1} - 1) S_{1}^2 + (n_{2} - 1) S_{2}^2}{n_{1} + n_{2} - 2} = S_{p}^2 \tag{4}$$

$S_{p}^2$ is called the pooled variance estimator.
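As a quick numerical sanity check of $(4)$, here is a minimal sketch using only the Python standard library. The two samples are made-up illustration data, not taken from any real study; any two samples with at least two observations each should behave the same way.

```python
from statistics import mean, variance

# Hypothetical data for two treatments (illustration only).
y1 = [4.1, 5.3, 6.0, 5.5]
y2 = [6.8, 7.4, 6.1, 8.0, 7.7]
n1, n2 = len(y1), len(y2)
N = n1 + n2

# SSE: squared deviations of each observation from its own group mean, as in (3).
sse = sum((y - mean(y1)) ** 2 for y in y1) + sum((y - mean(y2)) ** 2 for y in y2)
mse = sse / (N - 2)

# Pooled variance as in (4); statistics.variance divides by n - 1.
sp2 = ((n1 - 1) * variance(y1) + (n2 - 1) * variance(y2)) / (n1 + n2 - 2)

print(mse, sp2)  # the two numbers should agree up to floating-point rounding
```

The agreement is not a coincidence of the data: $(n_i - 1) S_i^2$ is exactly the within-group sum of squares for group $i$.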


2. Numerator of equation (1)

When $a = 2$ the numerator of expression $(1)$ is:

$$\frac{SST}{2-1} = SST$$

and the general expression for SST reduces to $SST = \sum_{i=1}^2 n_{i} (\bar{y}_{i.} - \bar{y}_{..})^2$. The next step is to expand the sum as follows:

$$SST = \sum_{i=1}^2 n_{i} (\bar{y}_{i.} - \bar{y}_{..})^2 = n_{1} (\bar{y}_{1.} - \bar{y}_{..})^2 + n_{2} (\bar{y}_{2.} - \bar{y}_{..})^2 \tag{5}$$

$\bar{y}_{..}$ is the global mean. It can be written as a weighted average of the two treatment means:

$$\bar{y}_{..} = \frac{n_{1} \bar{y}_{1.} + n_{2} \bar{y}_{2.}}{N} \tag{6}$$

Next, substitute (6) into formula (5) and re-write SST as:

$$SST = \underbrace{n_1 \big[ \bar{y}_{1.} - (\frac{n_1 \bar{y}_{1.} + n_2 \bar{y}_{2.}}{N}) \big]^2}_{\text{Part a}} + \underbrace{n_2 \big[ \bar{y}_{2.} - (\frac{n_1 \bar{y}_{1.} + n_2 \bar{y}_{2.}}{N}) \big]^2}_{\text{Part b}} \tag{7}$$

The next step is to simplify the expressions Part a and Part b.


2.a. Part a

$$\text{Part a} = n_1 \big[ \bar{y}_{1.} - (\frac{n_1 \bar{y}_{1.} + n_2 \bar{y}_{2.}}{N}) \big]^2$$

Multiply and divide the term with $\bar{y}_{1.}$ by $N$

$$n_1 \big[ \frac{N \bar{y}_{1.}}{N} - (\frac{n_1 \bar{y}_{1.} + n_2 \bar{y}_{2.}}{N}) \big]^2$$

$N$ is common denominator

$$n_1 \big[\frac{N \bar{y}_{1.} - n_1 \bar{y}_{1.} - n_2 \bar{y}_{2.}}{N} \big]^2$$

Factor $\bar{y}_{1.}$ out of the terms with $N$ and $n_1$

$$n_1 \big[\frac{(N - n_1) \bar{y}_{1.} - n_2 \bar{y}_{2.}}{N} \big]^2$$

Replace $(N - n_{1}) = n_{2}$

$$n_1 \big[\frac{n_2 \bar{y}_{1.} - n_2 \bar{y}_{2.}}{N} \big]^2$$

Now $n_{2}$ is common factor of $\bar{y}_{1.}$ and $\bar{y}_{2.}$

$$n_1 \big[\frac{n_2 (\bar{y}_{1.} - \bar{y}_{2.})}{N} \big]^2$$

Take $n_{2}$ and $N$ out of the square

$$\text{Part a} = \frac{n_{1} n_{2}^2}{N^2} (\bar{y}_{1.} - \bar{y}_{2.})^2$$


2.b. Part b

$$\text{Part b} = n_2 \big[ \bar{y}_{2.} - (\frac{n_1 \bar{y}_{1.} + n_2 \bar{y}_{2.}}{N}) \big]^2$$

Multiply and divide the term with $\bar{y}_{2.}$ by $N$

$$n_2 \big[ \frac{N \bar{y}_{2.}}{N} - (\frac{n_1 \bar{y}_{1.} + n_2 \bar{y}_{2.}}{N}) \big]^2$$

$N$ is common denominator

$$n_2 \big[\frac{N \bar{y}_{2.} - n_1 \bar{y}_{1.} - n_2 \bar{y}_{2.}}{N} \big]^2$$

Factor $\bar{y}_{2.}$ out of the terms with $N$ and $n_2$

$$n_2 \big[\frac{(N - n_2) \bar{y}_{2.} - n_1 \bar{y}_{1.}}{N} \big]^2$$

Replace $(N - n_{2}) = n_{1}$

$$n_2 \big[\frac{n_1 \bar{y}_{2.} - n_1 \bar{y}_{1.}}{N} \big]^2$$

Now $n_{1}$ is common factor of $\bar{y}_{1.}$ and $\bar{y}_{2.}$

$$n_2 \big[\frac{n_1 (\bar{y}_{2.} - \bar{y}_{1.})}{N} \big]^2$$

Take $n_{1}$ and $N$ out of the square

$$\text{Part b} = \frac{n_{2} n_{1}^2}{N^2} (\bar{y}_{2.} - \bar{y}_{1.})^2$$
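The closed forms for Part a and Part b can be checked numerically. The sketch below is only an illustration on made-up data: it computes each part directly from its definition in $(7)$ and compares it with the simplified expression.

```python
from statistics import mean

# Hypothetical data for two treatments (illustration only).
y1 = [4.1, 5.3, 6.0, 5.5]
y2 = [6.8, 7.4, 6.1, 8.0, 7.7]
n1, n2 = len(y1), len(y2)
N = n1 + n2
m1, m2 = mean(y1), mean(y2)

# Global mean written as the weighted average of the group means, equation (6).
grand = (n1 * m1 + n2 * m2) / N

# Each part computed from its definition in (7) ...
part_a_direct = n1 * (m1 - grand) ** 2
part_b_direct = n2 * (m2 - grand) ** 2

# ... and from the simplified closed forms.
part_a_closed = n1 * n2 ** 2 / N ** 2 * (m1 - m2) ** 2
part_b_closed = n2 * n1 ** 2 / N ** 2 * (m2 - m1) ** 2

print(part_a_direct, part_a_closed)
print(part_b_direct, part_b_closed)
```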


Now that we have Part a and Part b we are going to go back to equation $(7)$ and replace them:

$$SST = \frac{n_{1} n_{2}^2}{N^2} (\bar{y}_{1.} - \bar{y}_{2.})^2 + \frac{n_{2} n_{1}^2}{N^2} (\bar{y}_{2.} - \bar{y}_{1.})^2 \tag{8}$$

Taking into account that $(\bar{y}_{1.} - \bar{y}_{2.})^2 = (\bar{y}_{2.} - \bar{y}_{1.})^2$, we can re-write equation $(8)$ as $(9)$:

$$SST = \underbrace{\big[ \frac{n_{1} n_{2}^2}{N^2} + \frac{n_{2} n_{1}^2}{N^2} \big]}_{\text{Part c}} (\bar{y}_{1.} - \bar{y}_{2.})^2 \tag{9}$$

This leaves us with Part c, which we work out next.


2.c. Part c

$$\text{Part c} = \frac{n_{1} n_{2}^2}{N^2} + \frac{n_{2} n_{1}^2}{N^2}$$

$N^2$ is common denominator and each of the summands has a $n_{1} n_{2}$ factor that we can factor out. Then we have:

$$\frac{n_{1} n_{2} (n_{1} + n_{2})}{N^2}$$

Replace $N = n_{1} + n_{2}$

$$\frac{n_{1} n_{2} N}{N^2}$$

Simplify $N$

$$\frac{n_{1} n_{2}}{N}$$

Re-write the fraction

$$\frac{1}{\frac{N}{n_{1} n_{2}}}$$

Replace $N = n_{1} + n_{2}$

$$\frac{1}{\frac{n_{1} + n_{2}}{n_{1} n_{2}}} = \frac{1}{\frac{1}{n_{1}} + \frac{1}{n_{2}}}$$

And we have

$$\text{Part c} = \frac{1}{\frac{1}{n_{1}} + \frac{1}{n_{2}}}$$
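Because Part c involves only the sample sizes, the identity can be checked exactly with rational arithmetic. This is a small sketch, not a proof; the helper names `part_c` and `harmonic_form` are made up for the illustration.

```python
from fractions import Fraction

def part_c(n1, n2):
    """Part c as it first appears: n1*n2^2/N^2 + n2*n1^2/N^2 with N = n1 + n2."""
    N = n1 + n2
    return Fraction(n1 * n2 ** 2, N ** 2) + Fraction(n2 * n1 ** 2, N ** 2)

def harmonic_form(n1, n2):
    """The simplified form 1 / (1/n1 + 1/n2)."""
    return 1 / (Fraction(1, n1) + Fraction(1, n2))

# Exact comparison over a grid of small sample sizes.
for n1 in range(1, 20):
    for n2 in range(1, 20):
        assert part_c(n1, n2) == harmonic_form(n1, n2)

print(part_c(4, 5))  # 20/9
```

Using `Fraction` avoids floating-point rounding, so the equality test is exact for every pair of sizes tried.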


Finally, we have to replace this expression for Part c in $(9)$ and re-write SST as:

$$SST = \frac{1}{\frac{1}{n_{1}} + \frac{1}{n_{2}}} (\bar{y}_{1.} - \bar{y}_{2.})^2$$


3. Put all together

With the previous steps we have shown that, $\color{red} {\text{when a = 2}}$, we have:

$$\frac{SST}{2-1} = \frac{(\bar{y}_{1.} - \bar{y}_{2.})^2}{\frac{1}{n_{1}} + \frac{1}{n_{2}}}$$

and

$$\frac{SSE}{N-2} = S_{p}^2$$

The ratio of these two expressions, namely the F-statistic, is then:

$$F_{1, k} = \frac{\frac{SST}{2-1}}{\frac{SSE}{N-2}} = \frac{(\bar{y}_{1.} - \bar{y}_{2.})^2}{S_{p}^2 \big( \frac{1}{n_{1}} + \frac{1}{n_{2}} \big)} = t_{k}^2$$

And this concludes the proof.
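The whole result can also be confirmed numerically. The following is a minimal sketch on made-up data, using only the standard library: it computes $F$ from the ANOVA sums of squares and $t$ from the pooled two-sample formula, independently of each other, and compares $F$ with $t^2$.

```python
from statistics import mean, variance

# Hypothetical data for two treatments (illustration only).
y1 = [4.1, 5.3, 6.0, 5.5]
y2 = [6.8, 7.4, 6.1, 8.0, 7.7]
n1, n2 = len(y1), len(y2)
N = n1 + n2
m1, m2 = mean(y1), mean(y2)
grand = mean(y1 + y2)  # global mean over all observations

# One-way ANOVA with a = 2 treatments.
sst = n1 * (m1 - grand) ** 2 + n2 * (m2 - grand) ** 2
sse = sum((y - m1) ** 2 for y in y1) + sum((y - m2) ** 2 for y in y2)
F = (sst / (2 - 1)) / (sse / (N - 2))

# Two-sample t statistic with pooled variance.
sp2 = ((n1 - 1) * variance(y1) + (n2 - 1) * variance(y2)) / (n1 + n2 - 2)
t = (m1 - m2) / (sp2 * (1 / n1 + 1 / n2)) ** 0.5

print(F, t ** 2)  # should agree up to floating-point rounding
```

With these data the first group has the smaller mean, so `t` itself is negative while `F` is positive; only the square of `t` matches `F`.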

canovasjm

Because one has $\boxed{T^2=F}$.

To show that, you have to check that (with $N=m+n$):

  • $SSW/(N-2)= S^2_p$ (the unbiased estimate of $\sigma^2$)

  • $SSB = {(\bar X- \bar Y)}^2/(\frac{1}{n}+\frac{1}{m})$

To show the second point you only have to use :

  • the elementary equality $SSB=m{(\bar x - \bar{x\cdot y})}^2+n{(\bar y - \bar{x\cdot y})}^2$

  • the fact that the mean of the whole sample $x\cdot y=(x_1, \ldots, x_m, y_1, \ldots y_n)$ is the weighted mean $\frac{m \bar x + n \bar y}{m+n}$

  • some elementary but slightly tedious calculations to conclude

Sorry for the strange notation $x\cdot y$ for the "whole sample", this was my first idea and I'm in a hurry now.

Stéphane Laurent

You can rewrite the equation as
\begin{equation}\frac{SSB/\left(k-1\right)}{SSW/\left(N-k\right)}=\frac{SSB\left(k-1\right)/\left[\sigma^{2}\left(k-1\right)^{2}\right]}{SSW\left(N-k\right)/\left[\sigma^{2}\left(N-k\right)^{2}\right]}. \end{equation}
For $k=2$ (two groups),
\begin{equation}\frac{SSB\left(k-1\right)/\left[\sigma^{2}\left(k-1\right)^{2}\right]}{SSW\left(N-k\right)/\left[\sigma^{2}\left(N-k\right)^{2}\right]}=\frac{SSB/\sigma^{2}}{SSW\left(N-2\right)/\left[\sigma^{2}\left(N-2\right)^{2}\right]}. \end{equation}
Under the null hypothesis, the numerator has a $\chi^{2}$ distribution with one degree of freedom. The denominator has the following distribution:
\begin{equation} SSW\left(N-2\right)/\left[\sigma^{2}\left(N-2\right)^{2}\right]\sim\frac{\chi_{N-2}^{2}}{N-2}. \end{equation}
Therefore, you have the ratio of two independent $\chi^{2}$ variables, the second one divided by its degrees of freedom. This ratio has the same distribution as the square of a $t$ variable with $N-2$ degrees of freedom:
\begin{equation}\frac{\chi_{1}^{2}}{\chi_{N-2}^{2}/\left(N-2\right)}\sim t_{N-2}^2. \end{equation}
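
For reference, the distributional step rests on the standard definitions of the $t$ and $F$ distributions. The symbols $Z$, $V$ and $\nu$ below are introduced only for this illustration, with $\nu = N-2$ in the two-group setting. If $Z \sim N(0,1)$ and $V \sim \chi^2_{\nu}$ are independent, then

$$T = \frac{Z}{\sqrt{V/\nu}} \sim t_{\nu}, \qquad T^2 = \frac{Z^2/1}{V/\nu} \sim F_{1,\nu},$$

because $Z^2 \sim \chi^2_1$ and an $F_{1,\nu}$ variable is by definition the ratio of two independent $\chi^2$ variables, each divided by its degrees of freedom. Note that the $\chi^2_{\nu}$ variable is divided by $\nu$, not by $\nu^2$.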

wcampbell
  • Why are we assuming that k=1? And regarding the equality, shouldn't the numerator be 0 if k=1, because k-1=0? – Manko Apr 05 '13 at 15:50
  • Sorry, k = 2 because you have two groups. I edited my post. – wcampbell Apr 05 '13 at 15:54
  • :) So shouldn't $SSW(N-1)/\sigma^2(N-1)^2$ be $SSW(N-2)/\sigma^2(N-2)^2$? And when we reach the end of the equation, do we reach $t_{N-1}$? – Manko Apr 05 '13 at 15:57
  • Another mistake! I fixed that one too. At the end, you get $t_{N-2}$ which is the correct number of degrees of freedom for a two sample $t$ test. – wcampbell Apr 05 '13 at 17:47
  • You don't prove that $F=T^2$; you only prove that $F$ has *the same law* as $T^2$. By the way, you make a mistake: $\sqrt{F}$ does not have a $t$-distribution, since it takes only positive values. – Stéphane Laurent Apr 05 '13 at 19:58