
Related to this question: Maximum number of principal components in PCA. Is sklearn wrong?

If n_samples < n_features, PCA can only find n_samples - 1 meaningful directions. However, sklearn always returns n_components = min(n_samples, n_features) components. In the comments to the question above it was claimed that the last component should be trivially zero. Per sklearn's implementation, though, this is only true for the training dataset; for any new samples, the last component will not be zero:

from sklearn.decomposition import PCA
import numpy as np

pca = PCA()
train = np.random.rand(5, 10)  # n_samples = 5 < n_features = 10
pca.fit(train)

# On the training data, the last component is zero up to floating point noise:
print(pca.transform(train)[:, -1])
## output = [ 6.70056333e-17 -9.24789628e-18  1.18730019e-15 -3.71242110e-16 -4.70382051e-16]

# On new samples, it is not:
print(pca.transform(np.random.rand(5, 10))[:, -1])
## output = [-0.12061904 -0.33477243 -0.29965447  0.65033472 -0.05476772]
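
The corresponding eigenvalue is also (numerically) zero, which can be checked directly on the fitted model. A minimal check, reusing the pca object fitted above (explained_variance_ holds the eigenvalues of the covariance matrix):

# The fifth eigenvalue should be zero up to floating point noise, confirming
# that only n_samples - 1 = 4 directions carry any variance on the training set:
print(pca.explained_variance_)            ## last entry should be ~0
print(pca.explained_variance_ratio_[-1])  ## likewise ~0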

My question, then, is: what is the last direction produced by sklearn's PCA when n_samples < n_features?

  • What is trivially zero is the $n$th eigenvalue and the $n$th PC scores (as you observe on the training set). The *eigenvector* is non-zero, and so the test-set projection will be non-zero too. What the value of this eigenvector is, is a good question. I think it's an arbitrary vector orthogonal to the $n-1$ "nontrivial" eigenvectors (with nonzero eigenvalues). – amoeba Oct 30 '18 at 12:58
    @Amoeba That's right. When computing with perfect mathematical precision, the last eigenvalue will be $0$ with positive multiplicity. When the multiplicity exceeds $1,$ any orthogonal basis of the eigenspace may be chosen. When using floating point computations, the software might find a collection of tiny non-zero eigenvalues and associate with them a corresponding set of eigenvectors: it is, in effect, fitting *floating point "noise."* It's hard to conceive of any more detailed answer to this vague question, unless one were to undertake a numerical analysis of the software itself. – whuber Oct 30 '18 at 13:31
  • Thanks guys. I guessed it had to be in the subspace orthogonal to the $n-1$ 'nontrivial' eigenvectors, but I find it weird for an algorithm to produce an 'arbitrary' direction. It makes sense that the algorithm finds a specific direction due to floating point error. (A numerical check of this is sketched below.) – lizardfireman Nov 01 '18 at 06:00
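
Following up on the comments: a short numerical sketch, assuming only NumPy and sklearn's public attributes (components_, explained_variance_), of what that last eigenvector looks like. It checks that the extra eigenvector is a unit vector orthogonal both to the four "nontrivial" eigenvectors and to every centered training sample, i.e. it is some basis vector of the null space of the centered data, chosen by the underlying SVD routine rather than by the data:

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.RandomState(0)  # seeded so the checks are reproducible
train = rng.rand(5, 10)         # n_samples = 5 < n_features = 10

pca = PCA().fit(train)
last = pca.components_[-1]      # the "extra" fifth eigenvector

# Unit length, and orthogonal to the four nontrivial eigenvectors:
print(np.isclose(np.linalg.norm(last), 1.0))        ## expected: True
print(np.allclose(pca.components_[:-1] @ last, 0))  ## expected: True

# Orthogonal to every centered training sample as well, so it lies in the
# six-dimensional null space of the centered data matrix. Which basis vector
# of that null space comes out is an artifact of how the SVD handles
# floating point noise, not a property of the data:
centered = train - train.mean(axis=0)
print(np.allclose(centered @ last, 0))              ## expected: True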

0 Answers