I was trying to derive the equations on page 109 of "The Elements of Statistical Learning" (image below).

To be honest, I am not sure how the covariance $\Sigma$ is estimated (the third bullet point in the image). Can someone kindly show me how $\hat{\Sigma}$ is derived? In particular, I am not sure why $N - K$ appears in the denominator. Thanks.

UPDATE: I still cannot figure out why $N-K$ (where $K$ is the number of classes) shows up in the denominator, but I suspect it has to do with making the covariance matrix estimator unbiased. This is similar to how $\frac{1}{N} \sum_i (x_i - \bar{x})^2$ is biased, while $s^2 = \frac{1}{N-1} \sum_i (x_i - \bar{x})^2$ is not. Please correct me if I am wrong.

ttnphns
mynameisJEFF
  • Yes, N-K (K is the number of classes) is the proper degrees of freedom. You see the same in ANOVA. But why do you call Sigma_hat covariance? In this instance, it is variance. – ttnphns Mar 19 '14 at 20:46
  • @ttnphns: In this quote it *is* a covariance matrix (because $x_i$ are vectors). – amoeba Mar 20 '14 at 14:39
  • http://stats.stackexchange.com/questions/3931/intuitive-explanation-for-dividing-by-n-1-when-calculating-standard-deviation/169126#169126 but you have estimated $K$ means –  Feb 16 '16 at 06:05

1 Answer

You are right. The equation for the shared variance-covariance matrix comes from the pooled variance.

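The shared covariance matrix $\hat{\Sigma}$ is taken as a weighted average of the individual class covariance matrices $\hat{\Sigma}_k$, each weighted by its degrees of freedom $n_k-1$.

So $\hat{\Sigma} = \frac{(n_1-1) \hat{\Sigma}_1 + (n_2-1) \hat{\Sigma}_2 + \cdots + (n_K-1) \hat{\Sigma}_K}{(n_1-1)+(n_2-1)+\cdots+(n_K-1)}$

The denominator sums to $\sum_k n_k - K = N - K$, which is where $N-K$ comes from: one degree of freedom is lost for each of the $K$ estimated class means. Since each $\hat{\Sigma}_k = \frac{1}{n_k-1} \sum_{i \in \text{class } k} (x_i - \hat{\mu}_k)(x_i - \hat{\mu}_k)^T$ is unbiased, so is this weighted average.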
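As a quick numerical check, here is a minimal sketch in plain Python (standard library only) showing that the weighted average of per-class covariances matches the sum of within-class scatter matrices divided by $N-K$. The class sizes, the simulated data, and the helper names (`mean_vector`, `scatter_matrix`, `scale`, `add`) are made up for illustration and are not taken from the book.

```python
import random


def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def scatter_matrix(rows):
    """Sum over rows of (x - mean)(x - mean)^T, returned as a nested list."""
    mu = mean_vector(rows)
    p = len(mu)
    s = [[0.0] * p for _ in range(p)]
    for x in rows:
        d = [xi - mi for xi, mi in zip(x, mu)]
        for a in range(p):
            for b in range(p):
                s[a][b] += d[a] * d[b]
    return s


def scale(m, c):
    """Multiply every entry of matrix m by the scalar c."""
    return [[c * v for v in row] for row in m]


def add(m1, m2):
    """Entry-wise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(m1, m2)]


random.seed(0)
p = 2                 # dimension of each observation (assumed)
sizes = [5, 8, 12]    # hypothetical class sizes n_1, ..., n_K
classes = [
    [[random.gauss(k, 1.0) for _ in range(p)] for _ in range(n)]
    for k, n in enumerate(sizes)
]
N, K = sum(sizes), len(sizes)
zero = [[0.0] * p for _ in range(p)]

# Route 1: weighted average of per-class covariances, weights n_k - 1.
weighted_sum = zero
for rows in classes:
    n_k = len(rows)
    sigma_k = scale(scatter_matrix(rows), 1.0 / (n_k - 1))
    weighted_sum = add(weighted_sum, scale(sigma_k, n_k - 1))
pooled_weighted = scale(weighted_sum, 1.0 / sum(n - 1 for n in sizes))

# Route 2: total within-class scatter divided by N - K.
total_scatter = zero
for rows in classes:
    total_scatter = add(total_scatter, scatter_matrix(rows))
pooled_direct = scale(total_scatter, 1.0 / (N - K))

# Largest entry-wise difference between the two routes.
max_diff = max(
    abs(a - b)
    for r1, r2 in zip(pooled_weighted, pooled_direct)
    for a, b in zip(r1, r2)
)
print("sum of (n_k - 1):", sum(n - 1 for n in sizes), "and N - K:", N - K)
print("max entry-wise difference:", max_diff)
```

The two routes should agree up to floating-point rounding, because multiplying each $\hat{\Sigma}_k$ by $n_k-1$ just recovers that class's scatter matrix.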

Binu Jasim