Estimating the covariance matrix in linear discriminant analysis

Question

I was trying to derive the equations on page 109 of "The Elements of Statistical Learning" (image below).

To be honest, I am not sure how the covariance $\Sigma$ is estimated (the third bullet point in the image). Can someone kindly show me how $\hat{\Sigma}$ is derived? In particular, I am not sure how $N - K$ appears in the denominator. Thanks.

UPDATE: I still cannot figure out why $N-K$ (where $K$ is the number of parameters) shows up in the denominator, but I suspect it has to do with making the covariance matrix estimator unbiased. This is similar to how $\frac{1}{N} \sum_i (x_i - \bar{x})^2$ is biased, while $s^2 = \frac{1}{N-1} \sum_i (x_i - \bar{x})^2$ is not. Please correct me if I am wrong.

Yes, $N-K$ ($K$ is the number of classes) is the proper degrees of freedom. You see the same in ANOVA. But why do you call $\hat{\Sigma}$ covariance? In this instance, it is variance. — ttnphns, Mar 19 '14 at 20:46
@ttnphns: In this quote it *is* a covariance matrix (because $x_i$ are vectors). — amoeba, Mar 20 '14 at 14:39
http://stats.stackexchange.com/questions/3931/intuitive-explanation-for-dividing-by-n-1-when-calculating-standard-deviation/169126#169126 but you have estimated $K$ means — , Feb 16 '16 at 06:05

score 7 · Answer 1 · answered Feb 16 '16 at 04:53

You are right. The equation for the shared variance-covariance matrix comes from the pooled variance.

The shared covariance matrix $\Sigma$ is taken as a weighted average of the individual class covariance matrices $\Sigma_i$, weighted by $n_i-1$, where $n_i$ is the number of observations in class $i$.

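So $\Sigma = \frac{(n_1-1) \Sigma_1 + (n_2-1) \Sigma_2 + \dots + (n_K-1) \Sigma_K}{(n_1-1)+(n_2-1)+\dots+(n_K-1)}$

Since $n_1 + n_2 + \dots + n_K = N$, the denominator equals $N - K$: one degree of freedom is lost for each of the $K$ estimated class means.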
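As a quick numerical check, here is a minimal NumPy sketch. It assumes each per-class covariance is the usual unbiased sample covariance (denominator $n_i - 1$). The function names `pooled_covariance` and `weighted_average_covariance` and the random toy data are made up for illustration. It compares the weighted average with the within-class scatter divided by $N - K$.

```python
import numpy as np

def pooled_covariance(X, y):
    """Within-class scatter summed over classes, divided by N - K."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    N, K = X.shape[0], classes.size
    scatter = np.zeros((X.shape[1], X.shape[1]))
    for c in classes:
        Xc = X[y == c]
        centered = Xc - Xc.mean(axis=0)   # subtract the class mean
        scatter += centered.T @ centered  # sum of outer products for this class
    return scatter / (N - K)

def weighted_average_covariance(X, y):
    """Weighted average of per-class sample covariances, weights n_i - 1."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    numerator = np.zeros((X.shape[1], X.shape[1]))
    denominator = 0
    for c in np.unique(y):
        Xc = X[y == c]
        n_c = Xc.shape[0]
        # np.cov uses ddof=1 by default, i.e. divides by n_c - 1
        numerator += (n_c - 1) * np.cov(Xc, rowvar=False)
        denominator += n_c - 1
    return numerator / denominator

# Toy data (hypothetical): 3 classes with 10, 20 and 30 observations in 3 dimensions
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))
y = np.repeat([0, 1, 2], [10, 20, 30])

S_scatter = pooled_covariance(X, y)
S_weighted = weighted_average_covariance(X, y)
print(np.allclose(S_scatter, S_weighted))
```

The two agree because $(n_i - 1)\Sigma_i$ is exactly the scatter matrix of class $i$ around its own mean, so the numerator is the total within-class scatter and the denominator is $N - K$. Each class needs at least two observations for its sample covariance to be defined.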
