I've been reading this code (based on this R package) and I found that the number of non-zero eigenvalues of the estimated covariance is roughly equal to the average of $x_i^T \hat{\Sigma}^{-1} x_i$ over the observations. I want to know how to arrive at this result.
This arises in the context of maximum likelihood estimation of generalized ARMA coefficients. I've made a few tests with data generated from a multivariate normal distribution with random covariance matrices, and the results are consistent (a minimal sketch of such a test is included below).
It seems I'm lacking a bit of linear algebra.
Some background:
- $x_i$ is a real column vector with dimension $d$ (one observation)
- $X = [x_1, x_2, \ldots, x_n]$ with shape $d\times n$ (all the observations)
I want to prove that:
$$\sum_{i=1}^{n} x_i^T \hat{\Sigma}^{-1} x_i = n\operatorname{len}(s)$$
where $\operatorname{len}(s)$ is the number of non-zero singular values* of $\hat{\Sigma}$, which is defined as
$$\hat{\Sigma} = \frac{1}{n} \sum_{j=1}^{n} x_j x_j^T$$
If necessary, the mean can be assumed to be $0$.
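For reference, here is a minimal sketch of the kind of test mentioned above. The dimensions, rank, and seed are arbitrary illustrative choices, and `np.linalg.pinv` stands in for $\hat{\Sigma}^{-1}$ because $\hat{\Sigma}$ is singular in this setup (its `rcond` cutoff is set to match the threshold described in the footnote below):

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, r = 5, 1000, 3    # r < d, so Sigma_hat comes out rank-deficient

# Random covariance matrix of rank r
A = rng.standard_normal((d, r))
cov = A @ A.T

# Zero-mean observations stored as columns of X (shape d x n), as in the background
X = rng.multivariate_normal(np.zeros(d), cov, size=n).T

# Empirical covariance: (1/n) * sum_j x_j x_j^T
Sigma_hat = X @ X.T / n

# Pseudo-inverse in place of the inverse; rcond matches the footnote's threshold
eps = np.finfo(np.float64).eps
Sigma_inv = np.linalg.pinv(Sigma_hat, rcond=np.sqrt(eps))

# sum_i x_i^T Sigma_inv x_i, computed as tr(X^T Sigma_inv X)
total = np.einsum("ji,jk,ki->", X, Sigma_inv, X)
```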
*Not necessarily a mathematical $0$; it can also mean "values that are not too small".
More precisely, "non-zero" means "non-negligible" relative to a threshold defined as the largest singular value times the square root of the machine epsilon.
In Python: `s[0] * np.sqrt(np.finfo(np.float64).eps)`,
where `s` holds the singular values in descending order (see the code).
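Continuing the sketch, the counting step that defines $\operatorname{len}(s)$ and the final comparison would be:

```python
# Singular values of Sigma_hat in descending order
s = np.linalg.svd(Sigma_hat, compute_uv=False)

# Threshold from the footnote: largest singular value times sqrt(machine eps)
tol = s[0] * np.sqrt(np.finfo(np.float64).eps)
len_s = int((s > tol).sum())    # number of non-negligible singular values

print(total, n * len_s)    # the two agree up to floating-point error
```

With the rank-$3$ covariance above, `len_s` comes out to $3$ and the sum to roughly $3n$.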