How can I show that $\frac{1}{\sigma^2}\sum^k_{i=1}n_i[(\bar{Y}_{i.}-\bar{\bar{Y}})-(\theta_i-\bar{\theta})]^2 \sim \chi^2_{k-1}$?

Question

Define $\bar{\bar{Y}}=\sum n_i \bar{Y}_{i.}/\sum n_i$ and $\bar{\theta}=\sum n_i\theta_i / \sum n_i$, where $Y_{ij} \sim N(\theta_i,\sigma^2)$. How can I show that $\frac{1}{\sigma^2}\sum^k_{i=1}n_i[(\bar{Y}_{i.}-\bar{\bar{Y}})-(\theta_i-\bar{\theta})]^2 \sim \chi^2_{k-1}$ under ANOVA assumptions?

my work:

Let $\bar{U}_i=\bar{Y}_{i.}-\theta_i$, for $i=1,\ldots,k$. So, $\bar{U}_i \sim N(0,\frac{\sigma^2}{n_i})$.

Let $\bar{\bar{U}}=\bar{\bar{Y}}-\bar{\theta}$. So, $\bar{\bar{U}} \sim N(0,\frac{\sigma^2}{\sum n_i})$.

The linear combination $\bar{U}_i-\bar{\bar{U}} \sim N(0,\sigma^2(\frac{1}{n_i}+\frac{1}{\sum n_i}))$.

Hence we can rewrite the original expression as $\frac{1}{\sigma^2}\sum^k_{i=1}n_i(\bar{U}_{i}-\bar{\bar{U}})^2$.

I feel like my work is close, but I messed up somewhere, and I need help finding my error. Due to my variance term of the distribution given by $\bar{U}_{i}-\bar{\bar{U}}$, the given expression does not appear to follow a $\chi^2_{k-1}$ distribution. Where did I mess up?

updated work:

I have that $\frac{\sum n_i \bar{U}_i^2}{\sigma^2}=\frac{\sum n_i (\bar{U}_i-\bar{\bar{U}})^2}{\sigma^2}+\frac{\sum n_i \bar{\bar{U}}^2}{\sigma^2}$, where $\frac{\sum n_i \bar{U}_i^2}{\sigma^2} \sim \chi^2_k$ and $\frac{\sum n_i \bar{\bar{U}}^2}{\sigma^2}\sim \chi^2_1$

However, I need to show that the two terms added on the right-hand side of the equality are independent. How may I go about this?

$\bar U_i$ and $\bar{\bar U}$ are not independent. // For simplicity, maybe begin with a balanced design with all $n_i = n.$ Then the sample mean of the $k$ group means $\bar U_i$ is $\bar{\bar U}.$ For a random sample $X_1, \dots, X_n,$ from $\mathsf{Norm}(\mu, \sigma),$ it is known that $\frac{(n-1)S^2}{\sigma^2} \sim \mathsf{Chisq}(\nu = n-1).$ — BruceET, May 04 '20 at 22:56
@BruceET I see. I made a mistake in assuming that $\bar{U}_i$ and $\bar{\bar{U}}$ were independent. Since $\bar{U}_i \sim N(0,\frac{\sigma^2}{n})$, then we know that $\frac{n_i\bar{U}_i}{\sigma^2} \sim \chi^2_{n_i-1}$. So, $\bar{\bar{U}}=\frac{\sum n_iU_i/\sigma^2}{\sum n_i / \sigma^2}$. Is this the direction that I should be pursuing? — Jen Snow, May 04 '20 at 23:28
@BruceET I also see that $s^2_k=n_1(\bar{U}_1-\bar{\bar{U}})^2+...+n_k(\bar{U}_k-\bar{\bar{U}})^2=\frac{(\bar{U}_1-...-\bar{U}_k)^2}{\frac{1}{n_1}+...+\frac{1}{n_k}}$ , where $\bar{U}_1-...-\bar{U}_k \sim N(0,\sigma^2(\frac{1}{n_1}+...+\frac{1}{n_k}))$. I just don't see how I can tie these facts together to show that the given expression follows a $\chi^2_{k-1}$ . — Jen Snow, May 04 '20 at 23:31
@BruceET I added some updated work to the post, but I am stuck at showing independence again. Do you have any suggestions from here aside from finding the joint of $(U_1,...,U_{k-1})$ then using a Jacobian transformation to show that it factorizes? — Jen Snow, May 05 '20 at 19:09

score 2 · Accepted Answer · answered Jun 17 '20 at 20:15

I assume the ANOVA model is

$$Y_{ij}=\theta_i+\varepsilon_{ij}\quad,\small\,i=1,2,\ldots,k\,;\,j=1,2,\ldots,n_i$$ where $\varepsilon_{ij}$'s are i.i.d $N(0,\sigma^2)$ for all $i,j$. In other words, $Y_{ij}\sim N(\theta_i,\sigma^2)$ independently $\forall\, i,j$.

The mean of the $i$th group is $$\overline {Y_{i\cdot}}=\frac1{n_i}\sum\limits_{j=1}^{n_i}Y_{ij}\quad,\, i=1,\ldots,k$$

The grand mean is then $$\overline Y=\frac{\sum_{i=1}^k n_i\overline {Y_{i\cdot}}}{\sum_{i=1}^k n_i}$$

You also defined $$\overline\theta=\frac{\sum_{i=1}^k n_i \theta_i}{\sum_{i=1}^k n_i}$$

Now $\overline {Y_{i\cdot}}\sim N\left(\theta_i,\frac{\sigma^2}{n_i}\right)$ independently for each $i$, so that

$$X_i=\overline {Y_{i\cdot}}-\theta_i\stackrel{\text{ ind.}}\sim N\left(0,\frac{\sigma^2}{n_i}\right)\quad,\,i=1,\ldots,k$$

We also have the weighted average

$$\overline X_w=\overline Y-\overline\theta=\frac{\sum_{i=1}^k n_i(\overline {Y_{i\cdot}}-\theta_i)/\sigma^2}{\sum_{i=1}^k n_i/\sigma^2}=\frac{\sum_{i=1}^k w_i X_i}{\sum_{i=1}^k w_i}\,,$$

where $w_i=\frac{n_i}{\sigma^2}$ are the weights.

As you said, the problem boils down to finding the distribution of the weighted sum of squares

$$S^2=\sum_{i=1}^k \frac{n_i}{\sigma^2}\left\{(\overline {Y_{i\cdot}}-\theta_i)-(\overline Y-\overline\theta)\right\}^2=\sum_{i=1}^k w_i(X_i-\overline X_w)^2$$
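As a numerical sanity check on the $\chi^2_{k-1}$ claim from the question, the statistic $S^2$ can be simulated. The following is only a rough sketch: the function name `simulate_s2`, the group sizes, the $\theta_i$ values and $\sigma$ are all made up for illustration, and the group means are drawn directly from $N(\theta_i,\sigma^2/n_i)$ rather than from individual observations.

```python
import numpy as np

def simulate_s2(n_sizes, theta, sigma, reps, seed=0):
    """Draw `reps` replicates of S^2 and of the weighted mean Xbar_w.

    Hypothetical helper for illustration only; it samples the group means
    directly as Ybar_i ~ N(theta_i, sigma^2 / n_i), independently over i.
    """
    rng = np.random.default_rng(seed)
    n = np.asarray(n_sizes, dtype=float)
    theta = np.asarray(theta, dtype=float)
    # One row per replicate, one column per group.
    ybar = theta + rng.standard_normal((reps, n.size)) * sigma / np.sqrt(n)
    grand = (ybar * n).sum(axis=1) / n.sum()      # weighted grand mean
    theta_bar = (n * theta).sum() / n.sum()       # weighted mean of the theta_i
    dev = (ybar - grand[:, None]) - (theta - theta_bar)
    s2 = (n * dev**2).sum(axis=1) / sigma**2
    return s2, grand - theta_bar

# Assumed example values: k = 4 unbalanced groups.
s2, xbar_w = simulate_s2([3, 5, 8, 4], [1.0, -2.0, 0.5, 3.0], 2.0, 200_000)
print("mean of S^2:", s2.mean())   # theory: k - 1 = 3
print("var of S^2: ", s2.var())    # theory: 2(k - 1) = 6
print("corr(S^2, Xbar_w^2):", np.corrcoef(s2, xbar_w**2)[0, 1])  # theory: 0
```

Matching the first two moments and seeing near-zero correlation does not prove the distributional result, but a mismatch would flag an algebra error quickly.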

Using general facts about distributions of quadratic forms (like some form of Cochran's theorem) it can be shown that $S^2\sim \chi^2_{k-1}$, but for a more instructive derivation using orthogonal transformations you can refer to this post on Math.SE. The independence of $\overline X_w$ and $S^2$ can also be shown this way.
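One way the orthogonal-transformation argument can go is sketched below. The symbols $N$, $Z$, $a$, $Q$ and $V$ are introduced here only for illustration, and the linked post may organize the steps differently.

Put $N=\sum_{i=1}^k n_i$ and $Z_i=\sqrt{w_i}\,X_i$, so that $Z_1,\ldots,Z_k$ are i.i.d. $N(0,1)$. Let $a=\left(\sqrt{n_1/N},\ldots,\sqrt{n_k/N}\right)^\top$, which is a unit vector. Then

$$a^\top Z=\frac{1}{\sigma\sqrt N}\sum_{i=1}^k n_iX_i=\frac{\sqrt N}{\sigma}\,\overline X_w\quad,\quad S^2=\sum_{i=1}^k w_iX_i^2-\Big(\sum_{i=1}^k w_i\Big)\overline X_w^2=Z^\top Z-(a^\top Z)^2$$

Choose any orthogonal matrix $Q$ whose first row is $a^\top$ (for instance by Gram-Schmidt) and set $V=QZ$. Orthogonality gives $V\sim N_k(0,I_k)$, $V_1=a^\top Z$ and $V^\top V=Z^\top Z$, so

$$S^2=\sum_{i=1}^k V_i^2-V_1^2=\sum_{i=2}^k V_i^2\sim\chi^2_{k-1}$$

Since $S^2$ depends only on $V_2,\ldots,V_k$ while $\overline X_w=\sigma V_1/\sqrt N$ depends only on $V_1$, the two are independent.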

Thank you so much for providing this answer! This makes so much sense. I appreciate your detailed breakdown. — Jen Snow, Jun 18 '20 at 18:30
This can also be answered using [this](https://stats.stackexchange.com/q/188626/119261) theorem. — StubbornAtom, Jul 11 '20 at 14:42
