Is it true that, for a matrix $A \in \mathbb{R}^{n \times m}$ with $n<m$ (so with more features than samples), its covariance matrix is (or might be?) not positive semidefinite? If that's the case, can someone explain and prove it?
-
I've already seen that answer but I'm not able to generalize it to my case – crash Oct 24 '18 at 20:47
-
I'm pretty sure this question has been asked and answered, so I hope we can identify the duplicate. But in the meantime, note that a basic fact of linear algebra is that every matrix $B$ with more columns than rows has a nontrivial *kernel,* defined as the subspace of vectors $x$ it sends to zero: that is, $Bx=0$ for some nonzero $x.$ Consequently $x^\prime B^\prime B x=0,$ proving $B^\prime B$ is not definite. When $B$ is the centered version of $A,$ $B^\prime B$ is, up to the positive factor $1/(n-1),$ its covariance matrix. – whuber Oct 24 '18 at 21:02
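A quick numerical check of the argument in this comment may help. The following is only a sketch, assuming NumPy is available; the sizes `n = 5`, `m = 8`, the random data, and the `1e-10` tolerance are illustrative choices, not part of the original discussion. It builds a data matrix with fewer rows (samples) than columns (features), centers it, and confirms that the covariance matrix has no negative eigenvalues but does have zero eigenvalues, so it is positive semidefinite without being positive definite.

```python
import numpy as np

# Illustrative sizes (assumption): n samples (rows) < m features (columns).
n, m = 5, 8
rng = np.random.default_rng(0)
A = rng.standard_normal((n, m))

# B is the column-centered version of A; S is the m x m sample covariance.
B = A - A.mean(axis=0)
S = B.T @ B / (n - 1)

# Eigenvalues of the symmetric matrix S, in ascending order.
eigvals = np.linalg.eigvalsh(S)
rank = np.linalg.matrix_rank(B)

# A nonzero vector in the kernel of B: with the full SVD, the last rows of Vt
# are right singular vectors belonging to zero singular values.
_, _, Vt = np.linalg.svd(B)
x = Vt[-1]
quad = x @ S @ x  # equals ||Bx||^2 / (n - 1), which should be ~0

print("rank of B:", rank)
print("smallest eigenvalue of S:", eigvals[0])
print("x' S x for a kernel vector x:", quad)
```

Centering costs one more dimension, so the rank of $B$ is at most $n-1$ here, and the covariance matrix has at least $m-n+1$ zero eigenvalues (up to floating-point error).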
-
Thanks whuber, that makes sense. It's not so obvious to me that "every matrix B with more columns than rows has a nontrivial kernel", do you have some additional info/references regarding this statement? – crash Oct 24 '18 at 21:18
-
I think the OP has a point in that the proposed duplicate is not explicit about rank-deficient matrices. Someone who knows more about linear algebra than me could expand on @whuber's comment so we do have an answered question. – mdewey Oct 25 '18 at 09:20
-
There are many ways to demonstrate the basic premise I asserted. One follows easily by counting dimensions, because the dimension of the kernel cannot be any less than the dimension of the domain $(m)$ minus that of the image (at most $n$). See https://en.wikipedia.org/wiki/Rank%E2%80%93nullity_theorem. – whuber Oct 25 '18 at 12:05
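Spelled out, the dimension count in this comment might read as follows. This is a sketch that treats $B \in \mathbb{R}^{n \times m}$ as a linear map from $\mathbb{R}^m$ to $\mathbb{R}^n$ with $n<m$; the notation $\operatorname{rank}$ and $\ker$ is added here for illustration.

$$
\dim \ker B + \operatorname{rank} B = m, \qquad \operatorname{rank} B \le n
\quad\Longrightarrow\quad \dim \ker B \ge m - n > 0 .
$$

So there is a nonzero $x$ with $Bx = 0$. At the same time, for every $x \in \mathbb{R}^m$,

$$
x^\prime B^\prime B x = \lVert Bx \rVert^2 \ge 0 ,
$$

with equality for any nonzero $x$ in the kernel. Under these assumptions $B^\prime B$ is always positive semidefinite, but it cannot be positive definite when $n<m$.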
-
A lot of relevant info, maybe the dup, can be found in [this list](https://stats.stackexchange.com/search?q=rank+defic*+definit*) – kjetil b halvorsen May 12 '19 at 11:37