3

Having consulted a number of sources, I still can't find a complete proof that the regression sum of squares ($SS_{regression}$) and the residual sum of squares ($SS_{residual}$) are independent random variables. I'd be doubly pleased with a proof in matrix form. If it is too involved to type up here, I'm happy to work through it myself if someone can point me to a book that contains the proof.

kjetil b halvorsen
ColorStatistics
  • https://stats.stackexchange.com/questions/117406/proof-that-the-coefficients-in-an-ols-model-follow-a-t-distribution-with-n-k-d?noredirect=1&lq=1 should be helpful – Christoph Hanck Feb 19 '21 at 06:49
  • I see. Is the idea that $SS_{regression}$ is a function of $\hat{y}$ and $SS_{residual}$ is a function of $\hat{\epsilon}$, so that we can conclude $SS_{residual}$ is independent of $SS_{regression}$ once we show that $\hat{y}$ and $\hat{\epsilon}$ are uncorrelated, given the assumption of normal errors with constant variance? – ColorStatistics Feb 19 '21 at 13:45

1 Answer

3

Assume $y \sim \operatorname{Normal}(X\beta, \Sigma)$ with mean vector $\operatorname E[y]=X\beta=\mu$ and constant diagonal covariance $\operatorname{Cov}(y)=\Sigma=\sigma^2 \mathbb I$, and write $\bar y$ for the sample mean of $y$. Using the hat matrix $\mathbb H = X(X^TX)^{-1}X^T$ we have that:

$$\hat y =\mathbb H y$$

And

$$\epsilon=y-\hat y=(\mathbb I -\mathbb H)y$$
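
As a quick numerical sanity check of these two identities, here is a minimal NumPy sketch (the design matrix, coefficients, and sample size are made up purely for illustration) that builds $\mathbb H$ for a design with an intercept and verifies that it is symmetric and idempotent and that the residuals are orthogonal to the columns of $X$:

```python
# Minimal sketch: build the hat matrix for an arbitrary made-up design and
# check the identities \hat y = H y and \epsilon = (I - H) y numerically.
import numpy as np

rng = np.random.default_rng(0)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])   # design with an intercept
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(scale=0.3, size=n)

H = X @ np.linalg.inv(X.T @ X) @ X.T     # hat matrix
y_hat = H @ y                            # fitted values
resid = (np.eye(n) - H) @ y              # residuals

print(np.allclose(H, H.T))               # H is symmetric
print(np.allclose(H @ H, H))             # H is idempotent
print(np.allclose(y, y_hat + resid))     # y decomposes into fitted values plus residuals
print(np.allclose(X.T @ resid, 0))       # residuals are orthogonal to the columns of X
```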

Then, because the hat matrix is symmetric and idempotent ($\mathbb H^2 = \mathbb H$) and, with an intercept in the model, $\mathbf 1^T\hat y=\mathbf 1^Ty=n\bar y$: $$\begin{cases} SSR = (\hat y - \bar y\mathbf 1)^T(\hat y - \bar y\mathbf 1)= y^T \mathbb Hy \color{red}{+n\bar y^2-2\bar y\mathbf 1^T\hat y}= y^T \mathbb Hy \color{red}{-n\bar y^2}\\ SSE = \epsilon ^T \epsilon = y^T (\mathbb I -\mathbb H)y \end{cases}$$
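
Continuing the sketch above (reusing the hypothetical `X`, `y`, `n`, `H`, `y_hat`, and `resid`), one can check that these quadratic-form expressions agree with the elementwise definitions of $SSR$ and $SSE$:

```python
# Check SSR = y'Hy - n*ybar^2 and SSE = y'(I - H)y against their definitions.
y_bar = y.mean()

SSR_def = np.sum((y_hat - y_bar) ** 2)     # sum_i (yhat_i - ybar)^2
SSE_def = np.sum((y - y_hat) ** 2)         # sum_i (y_i - yhat_i)^2

SSR_quad = y @ H @ y - n * y_bar ** 2
SSE_quad = y @ (np.eye(n) - H) @ y

print(np.allclose(SSR_def, SSR_quad))
print(np.allclose(SSE_def, SSE_quad))
```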

Since $n\bar y^2 = y^T\mathbb J y$ with $\mathbb J=\frac 1n\mathbf 1\mathbf 1^T$, and $\mathbb J(\mathbb I -\mathbb H)=\mathbb 0$ (because $\mathbb H$ is symmetric and an intercept in the model gives $\mathbb H\mathbf 1=\mathbf 1$, hence $\mathbb J\mathbb H=\mathbb J$), the $\color{red}{-n\bar y^2}$ term has zero covariance with $SSE$ by the same quadratic-form identity used below; hence

$$\operatorname{Cov}(SSR,SSE) = \operatorname{Cov}(y^T \mathbb Hy,y^T (\mathbb I -\mathbb H)y)$$

Using that $\operatorname{Cov}(x^TAx,x^TBx)=4\mu^TA\Sigma B\mu + 2\operatorname{tr}(A\Sigma B\Sigma)$ for $x\sim\operatorname{Normal}(\mu,\Sigma)$ and symmetric $A$, $B$ (see Prove that $\mathrm{Cov}(x^TAx,x^TBx) = 2 \mathrm{Tr}(A \Sigma B \Sigma) + 4 \mu^TA \Sigma B \mu$):
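
That identity is easy to check by simulation; below is a rough Monte Carlo sketch (with arbitrary made-up $\mu$, $\Sigma$, $A$, $B$), not a proof:

```python
# Monte Carlo check of Cov(x'Ax, x'Bx) = 4 mu'A Sigma B mu + 2 tr(A Sigma B Sigma)
# for symmetric A, B and x ~ Normal(mu, Sigma).
import numpy as np

rng = np.random.default_rng(1)
d = 4
mu = rng.normal(size=d)
L = rng.normal(size=(d, d))
Sigma = L @ L.T + d * np.eye(d)                 # an arbitrary positive-definite covariance
A = rng.normal(size=(d, d)); A = (A + A.T) / 2  # arbitrary symmetric matrices
B = rng.normal(size=(d, d)); B = (B + B.T) / 2

x = rng.multivariate_normal(mu, Sigma, size=200_000)
qA = np.einsum("ij,jk,ik->i", x, A, x)          # x'Ax for every draw
qB = np.einsum("ij,jk,ik->i", x, B, x)          # x'Bx for every draw

mc_cov = np.cov(qA, qB)[0, 1]
theory = 4 * mu @ A @ Sigma @ B @ mu + 2 * np.trace(A @ Sigma @ B @ Sigma)
print(mc_cov, theory)                           # should agree up to Monte Carlo error
```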

$$\operatorname{Cov}(SSR,SSE) =4\mu^T\mathbb H\Sigma (\mathbb I -\mathbb H)\mu + 2\operatorname{tr}(\mathbb H\Sigma (\mathbb I -\mathbb H)\Sigma)\\ $$

Since $\Sigma=\sigma^2 \mathbb I$ is constant diagonal:

$$\operatorname{Cov}(SSR,SSE) =4\sigma^2\mu^T\mathbb H (\mathbb I -\mathbb H)\mu + 2\sigma^4\operatorname{tr}(\mathbb H (\mathbb I -\mathbb H))\\ =4\sigma^2\mu^T (\mathbb H -\mathbb H^2)\mu +2\sigma^4\operatorname{tr}(\mathbb H -\mathbb H^2)\\ =0$$
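
For the case raised in the comments where $x$ (the design) is fixed by the experimenter, a simulation that holds $X$ fixed and redraws $y$ illustrates the result: the sample correlation between $SSR$ and $SSE$ across replications is close to zero. A rough sketch, with all sizes and coefficients made up:

```python
# Hold X fixed, redraw y ~ Normal(X beta, sigma^2 I) many times, and check that
# SSR and SSE are (empirically) uncorrelated across replications.
import numpy as np

rng = np.random.default_rng(2)
n, sigma = 40, 1.5
X = np.column_stack([np.ones(n), np.linspace(0.0, 1.0, n)])  # fixed, experimenter-chosen design
beta = np.array([2.0, -3.0])
H = X @ np.linalg.inv(X.T @ X) @ X.T
M = np.eye(n) - H

ssr, sse = [], []
for _ in range(50_000):
    y = X @ beta + rng.normal(scale=sigma, size=n)
    ssr.append(y @ H @ y - n * y.mean() ** 2)
    sse.append(y @ M @ y)

print(np.corrcoef(ssr, sse)[0, 1])   # close to 0
```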

Firebug
  • Very elegant proof. I like it but I am not clear on something and not convinced for the case when x is non-random. As I see it, the key to this proof is the assumption that $y \sim N(0,\Sigma)$. Given that the dependent variable in our sample is what it is and its population average is unknown, what does it mean to assume that the random vector $y$ is multivariate normally distributed with mean 0 and constant variance? Don't we make assumptions only about the distribution of the error term and derive the distribution of the random vector $y$? – ColorStatistics Feb 19 '21 at 16:21
  • When x is non-random, the random vector y will, by design, not have mean 0. In fact the $y_i$ will not all have the same mean. Would you disagree? – ColorStatistics Feb 19 '21 at 16:21
  • @ColorStatistics the zero mean part is not that important, it's just to simplify the expressions, but the result would be the same (a mean term would appear in the Cov expression, but then it would be nullified by H and H²) – Firebug Feb 19 '21 at 16:24
  • @ColorStatistics what do you mean by non-random y? With covariance zero then it's even simpler – Firebug Feb 19 '21 at 16:24
  • I see. Thank you for the clarification. I am clear on the case when X is random. But for x non-random, the $y_i$ will have different means so we need a different approach. See this answer on the fact that $y_i$ will have different means when x non-random. https://stats.stackexchange.com/a/308493/198058 – ColorStatistics Feb 19 '21 at 16:30
  • @ColorStatistics see the updated answer. As I said, the mean of $y$ makes no difference. And, if by non-random you mean zero (co)variance ($\sigma^2 = 0$), then the proof is even easier. – Firebug Feb 19 '21 at 16:42
  • By x non-random I mean x as in an experiment where you pick the values of x. So the independent variables would not be random variables and thus there would not be a covariance of these to speak of. Can you either hint how to prove it in that case or add something along those lines to your answer and I'll mark this answer as complete. Thank you for this great answer. – ColorStatistics Feb 19 '21 at 16:47
  • @ColorStatistics the fact that X is non-random does not mean that y won't be. You can repeat the same experiment twice, and get different results. Also, the covariance of X does not enter the equation (well, it's implied in the Hat matrix, but that will still exist regardless of the fact that you chose X), so it does not change the result. – Firebug Feb 19 '21 at 16:57
  • I agree. I don't believe I ever said "non-random y" and certainly didn't intend to; only "non-random x". When x is non-random, see the link above which shows that the $y_i$ will have different means, therefore I think y cannot be multivariate normally distributed with a constant mean. – ColorStatistics Feb 19 '21 at 17:00
  • @ColorStatistics I misunderstood what you were saying. The $y$ are Gaussian distributed (or at least we assume they are), with expected values $\operatorname E[y] = X\beta$ and a diagonal covariance (because they are independent of each other). I don't violate any of these in my answer as far as I can see. – Firebug Feb 19 '21 at 17:05
  • I'll digest this. On an unrelated note: I think $SSR=y^T\mathbb Hy-n\bar y^2$ – ColorStatistics Feb 19 '21 at 17:21
  • @ColorStatistics just corrected it – Firebug Mar 05 '21 at 23:52
  • Hi, nice post. Just a quick comment. That $Hy$ and $(I-H)y$ are uncorrelated follows immediately since $H$ and $I-H$ are orthogonal. Due to joint normality, we know $Hy$ and $(I-H)y$ are independent. After that, we know that any functions of those are independent, including their sums of squares, giving the result. – user257566 Apr 09 '21 at 21:25
  • @user257566 Hi, thanks! True, that (I allude to the fact in [my site](https://bhvieira.github.io/blog/2021/03/HatMatrix/)). But the question asked for a proof, I thought it was better to write it all. – Firebug Apr 09 '21 at 21:33
  • I see, thanks for explaining. FWIW, your post didn't prove that the sums of squares were independent, just that they were uncorrelated. Am I misreading? – user257566 Apr 09 '21 at 21:34
  • @user257566 No, you got it right. Now that you mentioned it, I had it in my mind that uncorrelated chi-square variables are independent, though I'm not so sure now. I'll have to check, though their relationship to squared normals probably helps. Perhaps it brings us back to your initial assessment. – Firebug Apr 10 '21 at 01:28