
This is my first time asking a question. I am currently reading the textbook "Mathematical Statistics and Data Analysis" (3rd ed.) by Rice. On page 495, there is a Theorem B for the two-way layout ANOVA.

Let me first give some background. There are $I$ levels of one factor and $J$ levels of another factor, with $K$ independent observations in each of the $I \times J$ cells. The statistical model for an observation in a cell is $$ Y_{ijk} = \mu + \alpha_i + \beta_j + \delta_{ij} + \epsilon_{ijk}, $$ with the constraints $$ \sum_{i} \alpha_i = \sum_{j} \beta_j = \sum_{i} \delta_{ij} = \sum_{j} \delta_{ij} = 0. $$ Here $\mu$ is the grand mean, $\alpha_i$ is the differential effect of the $i$th level of the first factor, $\beta_j$ is the differential effect of the $j$th level of the second factor, and $\delta_{ij}$ is the interaction between the $i$th and $j$th levels. The errors $\epsilon_{ijk}$ are independent and normally distributed with mean zero and common variance $\sigma^2$.
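For concreteness, here is a minimal simulation sketch of this model in Python; the dimensions, effect sizes, $\mu$, and $\sigma$ are my own illustrative choices (not from the book), and the effects are centered so that the zero-sum constraints hold:

```python
import numpy as np

rng = np.random.default_rng(0)
I, J, K = 3, 4, 5            # illustrative numbers of levels and replicates
sigma = 1.0                  # error standard deviation (illustrative)
mu = 10.0                    # grand mean (illustrative)

# Draw raw effects, then center them so the zero-sum constraints hold.
alpha = rng.normal(size=I); alpha -= alpha.mean()
beta = rng.normal(size=J); beta -= beta.mean()
delta = rng.normal(size=(I, J))
delta -= delta.mean(axis=0, keepdims=True)   # each column of delta now sums to zero
delta -= delta.mean(axis=1, keepdims=True)   # each row now sums to zero (columns stay zero-sum)

# Y[i, j, k] = mu + alpha_i + beta_j + delta_ij + eps_ijk
eps = rng.normal(scale=sigma, size=(I, J, K))
Y = mu + alpha[:, None, None] + beta[None, :, None] + delta[:, :, None] + eps
```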

We know that $$ SS_{TOT} = SS_A + SS_B + SS_{AB} + SS_E, $$ where \begin{align*} &SS_A = JK \sum_{i=1}^{I} (\bar{Y}_{i..} - \bar{Y}_{...})^2 \\ &SS_B = IK \sum_{j=1}^{J} (\bar{Y}_{.j.} - \bar{Y}_{...})^2 \\ &SS_{AB} = K \sum_{i=1}^{I} \sum_{j=1}^{J} (\bar{Y}_{ij.} - \bar{Y}_{i..} - \bar{Y}_{.j.} + \bar{Y}_{...})^2 \\ &SS_E = \sum_{i=1}^{I} \sum_{j=1}^{J} \sum_{k=1}^{K} (Y_{ijk} - \bar{Y}_{ij.})^2 \\ &SS_{TOT} = \sum_{i=1}^{I} \sum_{j=1}^{J} \sum_{k=1}^{K} (Y_{ijk} - \bar{Y}_{...})^2 \end{align*} Also, we know that \begin{align*} &E[SS_A] = (I-1)\sigma^2 + JK \sum_{i=1}^{I} \alpha_i^2 \\ &E[SS_B] = (J-1)\sigma^2 + IK \sum_{j=1}^{J} \beta_j^2 \\ &E[SS_{AB}] = (I-1)(J-1) \sigma^2 + K \sum_{i=1}^{I} \sum_{j=1}^{J} \delta_{ij}^2 \\ &E[SS_E] = IJ(K-1) \sigma^2 \end{align*} The image below shows the aforementioned Theorem B, about which I have questions: [image of Theorem B from Rice, p. 495]
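As a numerical sanity check of this decomposition (my own sketch, not from the book; the helper name `anova_ss` is just an illustrative choice), the following snippet computes the sums of squares from an $I \times J \times K$ array and verifies that they add up to $SS_{TOT}$. Since the identity is purely algebraic, random noise data suffice:

```python
import numpy as np

def anova_ss(Y):
    """Return SS_A, SS_B, SS_AB, SS_E, SS_TOT for an (I, J, K) data array."""
    I, J, K = Y.shape
    y_bar = Y.mean()                      # grand mean   Y_bar_...
    yi = Y.mean(axis=(1, 2))              # row means    Y_bar_i..
    yj = Y.mean(axis=(0, 2))              # column means Y_bar_.j.
    yij = Y.mean(axis=2)                  # cell means   Y_bar_ij.

    ss_a = J * K * np.sum((yi - y_bar) ** 2)
    ss_b = I * K * np.sum((yj - y_bar) ** 2)
    ss_ab = K * np.sum((yij - yi[:, None] - yj[None, :] + y_bar) ** 2)
    ss_e = np.sum((Y - yij[:, :, None]) ** 2)
    ss_tot = np.sum((Y - y_bar) ** 2)
    return ss_a, ss_b, ss_ab, ss_e, ss_tot

Y = np.random.default_rng(1).normal(size=(3, 4, 5))
ss_a, ss_b, ss_ab, ss_e, ss_tot = anova_ss(Y)
print(np.isclose(ss_a + ss_b + ss_ab + ss_e, ss_tot))   # True
```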

The proofs of (a), (b), and (c) are straightforward; my questions concern the proofs of (d) and (e):

  1. How does one prove (d)? (I tried some algebra, hoping to relate $\bar{Y}_{ij.}$ to $\bar{Y}_{i..}$ and $\bar{Y}_{.j.}$, but got stuck.)
  2. For (e): since $SS_{A}$ and $SS_{B}$ can both be regarded as functions of $\bar{Y}_{ij.}$, it is easy to prove that each of them is independent of $SS_{E}$, which is what allows us to carry out the F tests. But $SS_{A}$ and $SS_{B}$ seem to be dependent on each other, which would contradict (e). (Or perhaps the author only means that $SS_{E}$ is independent of the others, not that the others are mutually independent; see the quick simulation sketch after this list.)
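For what it's worth, here is a quick Monte Carlo sketch (my own, with arbitrary illustrative dimensions) of the sample correlation between $SS_A$ and $SS_B$ under the model with all effects set to zero:

```python
import numpy as np

rng = np.random.default_rng(2)
I, J, K, sigma = 3, 4, 5, 1.0      # illustrative choices
ss_a_vals, ss_b_vals = [], []
for _ in range(20000):
    # All effects set to zero, so the data are pure noise.
    Y = rng.normal(scale=sigma, size=(I, J, K))
    y_bar = Y.mean()
    ss_a_vals.append(J * K * np.sum((Y.mean(axis=(1, 2)) - y_bar) ** 2))
    ss_b_vals.append(I * K * np.sum((Y.mean(axis=(0, 2)) - y_bar) ** 2))

print(np.corrcoef(ss_a_vals, ss_b_vals)[0, 1])  # should come out close to 0
```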
ttd

1 Answer


If you substitute the expressions for $\overline Y_{ij.},\overline Y_{i..},\overline Y_{.j.}$ and $\overline Y_{...}$ in $SS_{AB}$, then you get

$$SS_{AB}=K\sum_{i,j}(\delta_{ij}+\overline\varepsilon_{ij.}-\overline\varepsilon_{i..}-\overline\varepsilon_{.j.}+\overline\varepsilon_{...})^2$$
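(Spelling out the substitution: averaging the model over the appropriate indices and using the zero-sum constraints gives $$\overline Y_{ij.}=\mu+\alpha_i+\beta_j+\delta_{ij}+\overline\varepsilon_{ij.},\quad \overline Y_{i..}=\mu+\alpha_i+\overline\varepsilon_{i..},\quad \overline Y_{.j.}=\mu+\beta_j+\overline\varepsilon_{.j.},\quad \overline Y_{...}=\mu+\overline\varepsilon_{...},$$ so $\mu$, $\alpha_i$ and $\beta_j$ all cancel in $\overline Y_{ij.}-\overline Y_{i..}-\overline Y_{.j.}+\overline Y_{...}$, leaving only $\delta_{ij}$ and the error averages.)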

Taking $u_{ij}=\delta_{ij}+\overline\varepsilon_{ij.}$, this can be rewritten as

\begin{align} SS_{AB}&=K\sum _{i,j}(u_{ij}-\overline u_{i.}-\overline u_{.j}+\overline u_{..})^2 \\&=K\sum _{i,j}\left\{(u_{ij}-\overline u_{i.})-(\overline u_{.j}-\overline u_{..})\right\}^2 \\&=K\sum_{i,j}(u_{ij}-\overline u_{i.})^2-IK\sum_j (\overline u_{.j}-\overline u_{..})^2 \tag{1} \end{align}
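(The last step works because the cross term collapses: $\sum_i(u_{ij}-\overline u_{i.})=I(\overline u_{.j}-\overline u_{..})$ for every $j$, so $$2\sum_{i,j}(u_{ij}-\overline u_{i.})(\overline u_{.j}-\overline u_{..})=2I\sum_j(\overline u_{.j}-\overline u_{..})^2,$$ and subtracting this from the $+I\sum_j(\overline u_{.j}-\overline u_{..})^2$ that comes from expanding the square, then multiplying by $K$, yields the $-IK\sum_j(\overline u_{.j}-\overline u_{..})^2$ term in $(1)$.)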

Now, the $u_{ij}$'s are independent normal variables, i.e. for every $i,j$,

$$u_{ij}\stackrel{\text{ind.}}\sim N\left(\delta_{ij},\frac{\sigma^2}{K}\right)$$

So under $H_{AB}:\delta_{ij}=0$,

$$\sum_j \frac{\left(u_{ij}-\overline u_{i.}\right)^2}{\sigma^2/K}\stackrel{\text{ind.}}\sim \chi^2_{J-1} \quad,\forall\, i$$

This implies

$$\sum_{i,j}\frac{\left(u_{ij}-\overline u_{i.}\right)^2}{\sigma^2/K}\sim \chi^2_{I(J-1)} \tag{2}$$

Similarly,

$$\overline u_{.j}\stackrel{\text{ind.}}\sim N\left(0,\frac{\sigma^2}{IK}\right)\,, $$

so that

$$\sum_j\frac{\left(\overline u_{.j}-\overline u_{..}\right)^2}{\sigma^2/IK}\sim \chi^2_{J-1} \tag{3}$$

If you could argue that $\sum_j \left(\overline u_{.j}-\overline u_{..}\right)^2$ and $\sum_{i,j}(u_{ij}-\overline u_{i.}-\overline u_{.j}+\overline u_{..})^2$ are independent, then $(1),(2)$ and $(3)$ would give you the desired distribution of $SS_{AB}/\sigma^2$ under $H_{AB}$. Alternatively, you could use the general theory of distributions of quadratic forms to establish this. It might also be of interest to look at the Fisher–Cochran theorem on quadratic forms.
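As a purely numerical illustration (my own sketch, not part of the argument; the dimensions are arbitrary illustrative choices), one can simulate $SS_{AB}/\sigma^2$ under $H_{AB}$ and compare its empirical mean and variance with those of a $\chi^2_{(I-1)(J-1)}$ distribution:

```python
import numpy as np

rng = np.random.default_rng(3)
I, J, K, sigma = 3, 4, 5, 1.0
vals = []
for _ in range(20000):
    # Under H_AB all delta_ij = 0; the main effects cancel out of SS_AB,
    # so pure-noise data suffice for this check.
    Y = rng.normal(scale=sigma, size=(I, J, K))
    yij = Y.mean(axis=2)
    yi = Y.mean(axis=(1, 2))
    yj = Y.mean(axis=(0, 2))
    ss_ab = K * np.sum((yij - yi[:, None] - yj[None, :] + Y.mean()) ** 2)
    vals.append(ss_ab / sigma**2)

df = (I - 1) * (J - 1)
print(np.mean(vals), df)        # empirical mean vs. df
print(np.var(vals), 2 * df)     # empirical variance vs. 2*df
```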

I am not sure of part (e) of the theorem in your post. But it is certainly true that $SS_E$ is independent of each of $SS_A,SS_B$ and $SS_{AB}$.
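In the same Monte Carlo spirit (again my own sketch with illustrative dimensions), the sample correlations between $SS_E$ and the other sums of squares should come out near zero, consistent with that independence:

```python
import numpy as np

rng = np.random.default_rng(4)
I, J, K, sigma = 3, 4, 5, 1.0
ss = {"A": [], "B": [], "AB": [], "E": []}
for _ in range(20000):
    Y = rng.normal(scale=sigma, size=(I, J, K))
    y_bar, yij = Y.mean(), Y.mean(axis=2)
    yi, yj = Y.mean(axis=(1, 2)), Y.mean(axis=(0, 2))
    ss["A"].append(J * K * np.sum((yi - y_bar) ** 2))
    ss["B"].append(I * K * np.sum((yj - y_bar) ** 2))
    ss["AB"].append(K * np.sum((yij - yi[:, None] - yj[None, :] + y_bar) ** 2))
    ss["E"].append(np.sum((Y - yij[:, :, None]) ** 2))

for name in ("A", "B", "AB"):
    print(name, np.corrcoef(ss["E"], ss[name])[0, 1])  # each near 0
```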

StubbornAtom
  • Thanks, and sorry for the late reply; I have been busy with other things. I followed the approach (but kept the $\epsilon$ notation) and also got $$\frac{1}{\sigma^2/K} \sum_{i,j} (\bar{\epsilon}_{ij.} - \bar{\epsilon}_{i..})^2 \sim \chi^2_{I(J-1)}$$ and $$\frac{1}{\sigma^2/IK} \sum_{j} (\bar{\epsilon}_{.j.} - \bar{\epsilon}_{...})^2 \sim \chi^2_{J-1}.$$ Then I tried to prove that $\sum_{i,j} (\bar{\epsilon}_{ij.} - \bar{\epsilon}_{i..})^2$ and $\sum_{j} (\bar{\epsilon}_{.j.} - \bar{\epsilon}_{...})^2$ are independent. – ttd Oct 12 '21 at 15:30
  • My idea is as follows: the first term is a function of the vector $U = (\bar{\epsilon}_{11.} - \bar{\epsilon}_{1..}, \bar{\epsilon}_{12.} - \bar{\epsilon}_{1..}, \dotsc)$, and the second term is a function of the vector $V=(\bar{\epsilon}_{.1.} - \bar{\epsilon}_{...}, \bar{\epsilon}_{.2.} - \bar{\epsilon}_{...}, \dotsc)$. Since $X = (U, V)$ is multivariate normal, if we can prove that every element of $U$ is uncorrelated with every element of $V$, then $U$ and $V$ are independent, which completes the proof. – ttd Oct 12 '21 at 15:41
  • Relabel the elements of $U$ and $V$ as $\bar{\epsilon}_{ij_{1}.} - \bar{\epsilon}_{i..}$ and $\bar{\epsilon}_{.j_{2}.} - \bar{\epsilon}_{...}$. For $j_1 \neq j_2$, they are independent. But for $j_1 = j_2 = j$, I try to compute $$ \text{Cov}(\bar{\epsilon}_{ij.} - \bar{\epsilon}_{i..}, \,\bar{\epsilon}_{.j.} - \bar{\epsilon}_{...} ) $$ but get a nonzero result. I don't know where I am wrong. – ttd Oct 12 '21 at 15:46
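(A side note, not part of the original exchange: a quick Monte Carlo sketch, with my own illustrative dimensions, for checking that covariance numerically. The analytic value for $j_1=j_2=j$ works out to $\sigma^2(J-1)/(IJK)$, so a nonzero result is indeed expected from that computation.)

```python
import numpy as np

rng = np.random.default_rng(5)
I, J, K, sigma = 3, 4, 5, 1.0            # illustrative dimensions
R = 100_000                              # Monte Carlo replicates
eps = rng.normal(scale=sigma, size=(R, I, J, K))
m_ij = eps.mean(axis=3)                  # eps_bar_ij.  (shape R x I x J)
m_i = m_ij.mean(axis=2)                  # eps_bar_i..
m_j = m_ij.mean(axis=1)                  # eps_bar_.j.
m = m_ij.mean(axis=(1, 2))               # eps_bar_...

i, j = 0, 0                              # fix one (i, j) pair
a = m_ij[:, i, j] - m_i[:, i]            # eps_bar_ij. - eps_bar_i..
b = m_j[:, j] - m                        # eps_bar_.j. - eps_bar_...
print(np.cov(a, b)[0, 1])                # estimated covariance
print(sigma**2 * (J - 1) / (I * J * K))  # analytic value: nonzero
```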