Questions tagged [eigenvalues]

For questions involving calculation or interpretation of eigenvalues or eigenvectors.

Questions involving the calculation or interpretation of eigenvalues or eigenvectors should use this tag. This may include factor analysis, principal components analysis or regression, or other model estimation methods that require a positive definite matrix (i.e., a matrix whose eigenvalues are all positive). In factor analysis, a factor's eigenvalue is $\sum(\text{loadings on that factor})^2$, and a factor's eigenvalue divided by the sum of all eigenvalues is the proportion of total variance explained by that factor.

From Wikipedia:

An eigenvector of a square matrix A is a non-zero vector v that, when the matrix is multiplied by v, yields a constant multiple of v, the multiplier being commonly denoted by λ. That is:

$$A v = \lambda v$$

The number λ is called the eigenvalue of A corresponding to v.
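
A minimal numerical check of this definition (a NumPy sketch with a made-up matrix, purely for illustration):

```python
import numpy as np

# A small symmetric matrix, used purely as an example.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Eigenvalues and (orthonormal) eigenvectors of a symmetric matrix.
eigenvalues, eigenvectors = np.linalg.eigh(A)

# For every eigenpair, A v equals lambda v (up to floating-point error).
for lam, v in zip(eigenvalues, eigenvectors.T):
    assert np.allclose(A @ v, lam * v)
    print(lam, v)
```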

In the context of factor analysis, a factor's eigenvalue is the sum of all variables' squared loadings on that factor. A factor loading is the correlation of a variable with the factor, so the squared loading is the proportion of that variable's variance explained by the factor. The factor's eigenvalue divided by the sum of all eigenvalues is the proportion of total variance explained by the factor. From Wikipedia:

If a factor has a low eigenvalue, then it contributes little to the explanation of variances in the variables and may be ignored as redundant with more important factors.

All of the above is also true of principal components analysis. Principal components regression eliminates components with small eigenvalues for a similar purpose: reducing the dimensionality of a set of regressors.

Many criteria exist for identifying an appropriate eigenvalue threshold (for example, Kaiser's rule of retaining only factors with eigenvalues greater than 1), and they vary in their utility.
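
As a small illustration of the bookkeeping above, here is a sketch; the loading matrix is hypothetical, and the eigenvalue-greater-than-one cutoff is just one of the many criteria mentioned:

```python
import numpy as np

# Hypothetical loading matrix: 4 variables (rows) x 2 factors (columns);
# each entry is the correlation of a variable with a factor.
loadings = np.array([[0.8, 0.1],
                     [0.7, 0.2],
                     [0.2, 0.9],
                     [0.1, 0.6]])

# A factor's eigenvalue is the sum of the squared loadings on that factor.
eigenvalues = (loadings ** 2).sum(axis=0)

# For standardised variables the total variance equals the number of variables,
# so eigenvalue / (number of variables) is the share of total variance
# explained by each factor.
explained = eigenvalues / loadings.shape[0]

# One common (and debated) retention criterion: keep factors with eigenvalue > 1.
keep = eigenvalues > 1.0

print(eigenvalues, explained, keep)
```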

397 questions
1229 votes · 27 answers

Making sense of principal component analysis, eigenvectors & eigenvalues

In today's pattern recognition class my professor talked about PCA, eigenvectors and eigenvalues. I understood the mathematics of it. If I'm asked to find eigenvalues etc. I'll do it correctly like a machine. But I didn't understand it. I didn't…
claws
51 votes · 3 answers

How does centering make a difference in PCA (for SVD and eigen decomposition)?

What difference does centering (or de-meaning) your data make for PCA? I've heard that it makes the maths easier or that it prevents the first PC from being dominated by the variables' means, but I feel like I haven't been able to firmly grasp the…
Zenit
50 votes · 3 answers

Why does correlation matrix need to be positive semi-definite and what does it mean to be or not to be positive semi-definite?

I have been researching the meaning of positive semi-definite property of correlation or covariance matrices. I am looking for any information on Definition of positive semi-definiteness; Its important properties, practical implications; The…
45 votes · 7 answers

Why does Andrew Ng prefer to use SVD and not EIG of covariance matrix to do PCA?

I am studying PCA from Andrew Ng's Coursera course and other materials. In the Stanford NLP course cs224n's first assignment, and in the lecture video from Andrew Ng, they do singular value decomposition instead of eigenvector decomposition of…
DongukJu
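
A sketch of the point behind the SVD-versus-eigendecomposition question above: on centred data the two routes give the same eigenvalues and principal directions (the data here are randomly generated, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))            # hypothetical data: 100 samples, 5 variables
Xc = X - X.mean(axis=0)                  # centre each column

# Route 1: eigendecomposition of the sample covariance matrix.
C = Xc.T @ Xc / (len(Xc) - 1)
eigvals, eigvecs = np.linalg.eigh(C)
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]    # largest first

# Route 2: singular value decomposition of the centred data matrix itself.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# Same eigenvalues: squared singular values divided by (n - 1) ...
assert np.allclose(eigvals, s ** 2 / (len(Xc) - 1))

# ... and the same principal directions, up to the sign of each vector.
assert np.allclose(np.abs(Vt), np.abs(eigvecs.T))
```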
33 votes · 1 answer

If I generate a random symmetric matrix, what's the chance it is positive definite?

I got a strange question when I was experimenting some convex optimizations. The question is: Suppose I randomly (say standard normal distribution) generate a $N \times N$ symmetric matrix, (for example, I generate upper triangular matrix, and fill…
Haitao Du
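
One way to explore the question above empirically, assuming the matrix is built from i.i.d. standard normal entries in the upper triangle and then symmetrised (only a sketch; the exact generating distribution matters):

```python
import numpy as np

def prob_positive_definite(n, trials=20_000, seed=0):
    """Monte Carlo estimate of the chance that an n x n symmetric matrix with
    i.i.d. standard normal entries (upper triangle mirrored) is positive definite."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(trials):
        A = rng.normal(size=(n, n))
        S = np.triu(A) + np.triu(A, 1).T          # symmetrise the draw
        if np.linalg.eigvalsh(S).min() > 0:       # positive definite <=> all eigenvalues > 0
            hits += 1
    return hits / trials

# The estimated probability drops off rapidly as the dimension grows.
for n in (2, 3, 5, 8):
    print(n, prob_positive_definite(n))
```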
29 votes · 2 answers

Why are there only $n-1$ principal components for $n$ data if the number of dimensions is $\ge n$?

In PCA, when the number of dimensions $d$ is greater than (or even equal to) the number of samples $N$, why is it that you will have at most $N-1$ non-zero eigenvectors? In other words, the rank of the covariance matrix amongst the $d\ge N$…
GrokingPCA
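
The rank argument behind this question can be checked directly: after centring, the $N$ rows sum to zero, so the covariance matrix has rank at most $N-1$ and hence at most $N-1$ non-zero eigenvalues. A small demonstration on random data:

```python
import numpy as np

rng = np.random.default_rng(0)
N, d = 10, 50                            # fewer samples than dimensions
X = rng.normal(size=(N, d))
Xc = X - X.mean(axis=0)                  # centring makes the N rows sum to zero

# The d x d sample covariance matrix has rank at most N - 1.
C = Xc.T @ Xc / (N - 1)
eigvals = np.linalg.eigvalsh(C)

print(np.sum(eigvals > 1e-10))           # prints 9, i.e. N - 1 non-zero eigenvalues
```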
17 votes · 1 answer

What is the meaning of the eigenvectors of a mutual information matrix?

When looking at the eigenvectors of the covariance matrix, we get the directions of maximum variance (the first eigenvector is the direction in which the data varies the most, etc.); this is called principal component analysis (PCA). I was wondering…
kmace
15 votes · 1 answer

What is principal subspace in probabilistic PCA?

if $X$ is observed data matrix and $Y$ is latent variable then $$X=WY+\mu+\epsilon$$ Where $\mu$ is the mean of observed data, and $\epsilon$ is the Gaussian error/noise in data, and $W$ is called principal subspace. My question is when normal PCA…
user3086871
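
A sketch of the generative model in the question above, with made-up values of $W$, $\mu$ and the noise level, showing that the leading eigenvectors of the sample covariance approximately span the column space of $W$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, q = 2000, 4, 2                     # observations, observed dims, latent dims

W = rng.normal(size=(d, q))              # hypothetical "principal subspace" matrix
mu = np.array([1.0, -2.0, 0.5, 3.0])     # mean of the observed data
sigma = 0.1                              # noise standard deviation

# Draw from the model X = W Y + mu + eps with Y ~ N(0, I) and isotropic noise.
Y = rng.normal(size=(n, q))
eps = sigma * rng.normal(size=(n, d))
X = Y @ W.T + mu + eps

# With small noise, the top-q eigenvectors of the sample covariance span
# (approximately) the same subspace as the columns of W.
eigvals, eigvecs = np.linalg.eigh(np.cov(X, rowvar=False))
U = eigvecs[:, -q:]                      # leading q eigenvectors
residual = W - U @ (U.T @ W)             # part of W outside that span
print(np.linalg.norm(residual))          # small compared with np.linalg.norm(W)
```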
15 votes · 2 answers

Why does PCA maximize total variance of the projection?

Christopher Bishop writes in his book Pattern Recognition and Machine Learning a proof, that each consecutive principal component maximizes the variance of the projection to one dimension, after the data has been projected to orthogonal space to the…
michal
13 votes · 1 answer

Why eigenvectors reveal the groups in Spectral Clustering

According to Handbook of Cluster Analysis Spectral Clustering is done with following algorithm: Input Similarity Matrix $S$, number of clusters $K$ Form the transition matrix $P$ with $P_{ij} = S_{ij} / d_i$ for $i,j = 1:n$ where $d_i= \sum_{j=1}^n…
jakes
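
A toy sketch of the step quoted above (the similarity matrix is made up, and this is not the handbook's full algorithm): form the row-normalised matrix $P$ and look at its leading eigenvectors, whose rows separate the groups:

```python
import numpy as np

# Hypothetical similarity matrix for 6 points that form two obvious groups.
S = np.array([[1.0, 0.9, 0.8, 0.1, 0.0, 0.1],
              [0.9, 1.0, 0.9, 0.0, 0.1, 0.0],
              [0.8, 0.9, 1.0, 0.1, 0.0, 0.1],
              [0.1, 0.0, 0.1, 1.0, 0.9, 0.8],
              [0.0, 0.1, 0.0, 0.9, 1.0, 0.9],
              [0.1, 0.0, 0.1, 0.8, 0.9, 1.0]])
K = 2

# Transition matrix P_ij = S_ij / d_i, with d_i the i-th row sum.
d = S.sum(axis=1)
P = S / d[:, None]

# The K eigenvectors with the largest eigenvalues embed the points so that
# members of the same group get (nearly) identical coordinates.
eigvals, eigvecs = np.linalg.eig(P)
order = np.argsort(eigvals.real)[::-1]
embedding = eigvecs[:, order[:K]].real

# With K = 2 the sign pattern of the second eigenvector already separates
# the groups; in general one clusters the rows of `embedding` (e.g. k-means).
labels = (embedding[:, 1] > 0).astype(int)
print(labels)                            # [0 0 0 1 1 1] or the complementary labelling
```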
13 votes · 1 answer

Explain how `eigen` helps inverting a matrix

My question relates to a computation technique exploited in geoR:::.negloglik.GRF or geoR:::solve.geoR. In a linear mixed model setup: $$ Y=X\beta+Zb+e $$ where $\beta$ and $b$ are the fixed and random effects respectively. Also,…
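
The question concerns geoR internals, which are not reproduced here; the underlying idea can be sketched generically: for a symmetric positive definite matrix, one eigendecomposition yields both the inverse and the log-determinant needed in a Gaussian likelihood:

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.normal(size=(5, 5))
A = B @ B.T + 5 * np.eye(5)              # a symmetric positive definite "covariance" matrix

# Eigendecomposition of a symmetric matrix: A = Q diag(lam) Q^T with Q orthogonal.
lam, Q = np.linalg.eigh(A)

# Inverting A only requires inverting the eigenvalues: A^{-1} = Q diag(1/lam) Q^T.
A_inv = (Q / lam) @ Q.T
assert np.allclose(A_inv, np.linalg.inv(A))

# The same decomposition gives the log-determinant used in the Gaussian
# likelihood essentially for free: log|A| = sum(log(lam)).
assert np.isclose(np.log(lam).sum(), np.linalg.slogdet(A)[1])
```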
12 votes · 1 answer

Why are eigen and svd decompositions of a covariance matrix based on sparse data yielding different results?

I am trying to decompose a covariance matrix based on a sparse / gappy data set. I'm noticing that the sum of lambda (explained variance), as calculated with svd, is being amplified with increasingly gappy data. Without gaps, svd and eigen yield the…
Marc in the box
12 votes · 2 answers

Why are PCA eigenvectors orthogonal and what is the relation to the PCA scores being uncorrelated?

I'm reading up on PCA, and I'm understanding most of what's going on in terms of the derivation apart from the assumption that eigenvectors need to be orthogonal and how it relates to the projections (PCA scores) being uncorrelated? I have two…
Pavan Sangha
12 votes · 3 answers

Is every correlation matrix positive definite?

I'm talking here about matrices of Pearson correlations. I've often heard it said that all correlation matrices must be positive semidefinite. My understanding is that positive definite matrices must have eigenvalues $> 0$, while positive…
11 votes · 1 answer

A paper mentions a "Monte Carlo simulation to determine the number of principal components"; how does it work?

I'm doing a Matlab analysis on MRI data where I have performed PCA on a matrix sized 10304x236 where 10304 is the number of voxels (think of them as pixels) and 236 is the number of timepoints. The PCA gives me 236 Eigenvalues and their related…
chainhomelow
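
The paper's exact procedure is not given here, but one common Monte Carlo approach is Horn's parallel analysis: retain components whose eigenvalues exceed those obtained from random data of the same size. A sketch (the data below are simulated for illustration):

```python
import numpy as np

def parallel_analysis(X, n_sims=200, quantile=0.95, seed=0):
    """Keep leading components whose eigenvalues exceed the chosen quantile of
    eigenvalues obtained from uncorrelated normal data of the same shape."""
    rng = np.random.default_rng(seed)
    n, p = X.shape

    # Observed eigenvalues of the correlation matrix, largest first.
    obs = np.linalg.eigvalsh(np.corrcoef(X, rowvar=False))[::-1]

    # Eigenvalues of correlation matrices computed from pure-noise data sets.
    sims = np.empty((n_sims, p))
    for i in range(n_sims):
        Z = rng.normal(size=(n, p))
        sims[i] = np.linalg.eigvalsh(np.corrcoef(Z, rowvar=False))[::-1]
    threshold = np.quantile(sims, quantile, axis=0)

    # Count leading components whose observed eigenvalue beats the noise threshold.
    k = 0
    while k < p and obs[k] > threshold[k]:
        k += 1
    return k

# Hypothetical data: 200 observations, 5 variables driven by 2 latent factors.
rng = np.random.default_rng(1)
latent = rng.normal(size=(200, 2))
X = np.column_stack([latent[:, [0, 0, 0]], latent[:, [1, 1]]]) + 0.5 * rng.normal(size=(200, 5))
print(parallel_analysis(X))              # typically selects 2 components here
```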