
Using the method of moments, one can try to approximate a sum of $\chi_{r}^{2}$ variables, $\sum a_{i}Y_{i}$, by equating the $n$-th moment of the sample with the $n$-th moment of the population and "solving" for the parameters this way. However, I am stuck on the derivation of the Satterthwaite approximation. The authors (Casella & Berger) suggest the following (page 314):

"..to do this we must match second moments, and we need $$\mathrm{E}\left(\sum^{k}_{i=1}a_{i}Y_{i}\right)^{2}=\mathrm{E}\left(\frac{\chi_{v}^{2}}{v}\right)^{2}=\frac{2}{v}+1$$

Applying the method of moments, we can drop the first expectation and solve for $\nu$, yielding $$\hat{\nu}=\frac{2}{\left(\sum^{k}_{i=1}a_{i}Y_i\right)^{2}-1}$$"
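(Just to spell out the quoted step: removing the outer expectation leaves $$\left(\sum^{k}_{i=1}a_{i}Y_{i}\right)^{2}=\frac{2}{\nu}+1,$$ and solving this for $\nu$ gives the displayed $\hat{\nu}$, so the algebra itself is not my difficulty.)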

My naive question is: why can we drop the expectation at all? This is not clear from the authors' description of the method of moments, in which one merely equates $$m_{j}=\frac{1}{n}\sum^{n}_{i=1}X_{i}^{j}\text{ with } \mathrm{E}X^{j},$$ and it seems to me that the expectation sign cannot simply be dropped. Similarly, in the last step of the derivation of the approximation formula, the authors write:

"..substituting this expression for the variance and removing the expectations, we obtain...."(page 315)

Can anyone give a hint? Sorry that the question is really basic.

Edit:

A commenter here suggested that the method of moments assumes $E(Z)=Z$ because one equates the two moments. I do not think this follows straight away from the definition. Even when $j=1$, one has to equate $\frac{1}{n}\sum^{n}_{i=1}X_{i}$ with $EX^{1}$. I do not think this implies $E(Z)=Z$ in general, such that one can use $Z=\sum a_{i}Y_{i}$.

Colibri
Bombyx mori
  • When the $X$'s have the same distribution, $E(\frac{1}{n}\sum^{n}_{i=1}X_{i}) = E(X_j)$ for all $j$ (from basic properties of expectations). When $n=1$ and $j=1$, you equate $E(X_1)$ with $X_1$. Let $Z=X_1$ and by C&B's definition $E(Z)=Z$, ... and so on. – Glen_b Aug 19 '13 at 01:59
  • @Glen_b: Thanks for the enlightenment. Now I see everything. – Bombyx mori Aug 19 '13 at 02:41
  • @Glen_b: I realized I do not know how to end this thread. Thanks for the reminder. – Bombyx mori Aug 19 '13 at 03:05

2 Answers

4

Background: Understanding the method of moments (MoM) in a basic way

Motivation for the method: The (strong) Law of Large Numbers (LLN) gives us reason to think that (at least for large samples) a sample expectation will be close to the population expectation (note that the LLN applies to higher moments by taking $Z=X^j$). Thus, if we have $iid$ $X_i, i=1,\ldots,n$, Casella & Berger's $m_j = \frac{1}{n} \sum_{i=1}^n X_i^j$ is set equal to $\text{E}(m_j) = \text{E}(X_i^j) = \mu_j$.
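A tiny simulation sketch of that motivation (my own illustration, not part of C&B's argument; the $\chi^2_5$ variable and the sample size are arbitrary choices): the sample moments $m_1$ and $m_2$ land close to the corresponding population moments.

```python
import numpy as np

# Minimal check of the LLN motivation: sample moments vs population moments
# for an arbitrary example variable X ~ chi^2_5 (my choice, not from C&B).
rng = np.random.default_rng(0)
r = 5
X = rng.chisquare(r, size=100_000)

for j in (1, 2):
    m_j = np.mean(X**j)                    # j-th sample moment
    mu_j = r if j == 1 else r * (r + 2)    # E(X) = r,  E(X^2) = r(r + 2)
    print(f"j={j}: sample moment {m_j:.3f} vs population moment {mu_j}")
```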

Why you only need consider first moments: Consider Casella & Berger's $m_j = \frac{1}{n} \sum_{i=1}^n X_i^j$ and note that (as we did in the motivating argument), for any $j$ we can just take $Z_i = X_i^j$ and be left with $m_1$ for a different random variable. That is, all MoM estimators can be thought of as first moment MoMs; we can simply make that substitution to get any other moment we need. So MoM is really just setting $m=\mu$ where $m = \frac{1}{n} \sum_{i=1}^n X_i$ for some set of $iid$ $X_i \sim f_X$.

Why you can think of MoM as 'drop expectations': (i) Take $Z = \frac{1}{n} \sum_{i=1}^n X_i$ and note that $\text{E}(Z)=\text{E}(X)$ by linearity of expectation, so MoM simply takes $Z=\text{E}(Z)$. Similarly, taking $Z^j = \text{E}(Z^j)$ follows immediately from the argument we already used - i.e. we can think of MoM as 'drop expectations', and it will be reasonable because we have some random variable which will be close to its expectation; (ii) more generally, we could reasonably do this ('drop expectations') for any $Z$ that we had reason to think would be 'close to' its expectation.

--

Now for the expression in the section relating to Satterthwaite in Casella & Berger

Casella & Berger match first and second moments of $Z=\sum_{i=1}^k a_iY_i$; that is, they take $\text{E}(Z) = Z$ and $\text{E}(Z^2)=Z^2$, the second of which gives an estimate of $\nu$.

Note that $Z=\sum_i a_iY_i$ is a constant times a sample expectation; there's a clear sense in which we might expect that $Z\approx \text{E}(Z)$ and $Z^2 \approx \text{E}(Z^2)$, but we don't actually have to justify it here; we're just following their argument about what happens when we do it.
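A rough simulation sketch of that last step (my own illustration, not from C&B: the degrees of freedom $r_i$ and the weights $a_i = 1/(k\,r_i)$, chosen so that $\mathrm{E}(Z)=1$, are arbitrary assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
r = np.array([10, 20, 30])        # assumed degrees of freedom of Y_i ~ chi^2_{r_i}
a = 1.0 / (len(r) * r)            # weights chosen so that E(Z) = sum(a_i r_i) = 1

# Exact moments of Z = sum_i a_i Y_i (computable here only because the r_i are known)
EZ = np.sum(a * r)                       # = 1 by construction
EZ2 = np.sum(a**2 * 2 * r) + EZ**2       # Var(Z) + (E Z)^2

nu_matched = 2 / (EZ2 - 1)               # nu from matching E(Z^2) = 2/nu + 1 exactly

# Method-of-moments version: a single realisation of Z stands in for E(Z^2)
Y = rng.chisquare(r)                     # one draw of each Y_i
Z = np.sum(a * Y)
nu_hat = 2 / (Z**2 - 1)                  # C&B's crude hat{nu}; noisy, and negative
                                         # whenever Z^2 happens to fall below 1

print(f"E(Z) = {EZ:.3f},  E(Z^2) = {EZ2:.3f}")
print(f"nu from exact moments: {nu_matched:.2f}")
print(f"nu_hat from one draw : {nu_hat:.2f}")
```

The single-draw $\hat{\nu}$ is of course noisy (and the formula even goes negative whenever $Z^2<1$), which gives some feel for why the derivation in C&B does not stop at this expression (the page 315 step quoted in the question).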

Glen_b
  • Since we know the degrees of freedom of the $Y_i$'s, it is easy to compute exactly the value of $\mathbb{E}[(\sum_{i=1}^ka_iY_i)^2]$, i.e., the population expectation. What is the motivation to use a sample mean here? – NP-hard May 11 '16 at 05:01
  • @NP-hard The issue is that the actual problem is slightly misrepresented in C&B -- in the original problem under discussion (the one being solved by Satterthwaite) we actually *don't* have chi-square variables, we have sample variances -- which are multiples of chi-square variables with an unknown scale factor because $\sigma_i^2$ is unknown. It must be estimated, so we must use sample variances to do that. So using the method of moments makes sense. If we actually had chi-square variates, it would be a simpler problem. – Glen_b May 11 '16 at 10:33
2

As pointed out patiently by Glen_b above, by linearity of expectation we have $$E\left(\sum a_{i}Y_{i}\right)=\sum a_{i}E(Y_{i})=\sum a_{i} Y_{i},$$ where the last step, $E(Y_{i})=Y_{i}$, is exactly the method-of-moments step: by definition the method of moments equates $E(X)$ with $\frac{1}{n}\sum X_{i}=\overline{X}$, which is just $X$ when $n=1$. So the authors' derivation is justified.

Bombyx mori
  • Can you really say $\bar X = X$ at the end there, without say setting $n=1$? Or is that already implied and I missed something? – Glen_b Aug 19 '13 at 03:17
  • To clarify, it would work if you do that last line differently. – Glen_b Aug 19 '13 at 03:41
  • @Glen_b: Yeah, I have my own doubts here. I thought that since the $X_{i}$ are identically distributed, they all have the same distribution as $X$, so this might work. But after some thought this sounds wrong. – Bombyx mori Aug 19 '13 at 16:15
  • It's correct as you have it now, but it's perhaps not the most elegant way to do it. – Glen_b Aug 19 '13 at 16:59
  • @Glen_b: This I surely admit. I was wondering if the last line of derivation enable me to claim $E(Z)=Z$ in general. – Bombyx mori Aug 19 '13 at 20:48
  • You can't *claim* $E(Z)=Z$, that's almost never true. But MoM surely includes $E(Z)=Z$ (at $n=1$), so you can invoke MoM in order to use it as an estimate. Now in many situations, $n=1$ will give terrible estimates, but in this case, $Z$ is actually just shorthand for something else that's a moment and the Law of Large Numbers (for example, which is what motivates using MoM in the first place) suggest $Z$ will tend to be close to $E(Z)$. The reason why you want to treat it as an $n=1$ version here is that it's in that sense that MoM can be said to act as a 'drop the expectation' operator. – Glen_b Aug 19 '13 at 21:22
  • I realized there's just not the space in comments to explain all of what I was trying to say, so I have posted my discussion as an answer. – Glen_b Aug 19 '13 at 23:45
  • @Glen_b: Thanks! Sorry I do not have enough time to write a good enough answer. – Bombyx mori Aug 20 '13 at 02:48