Questions tagged [pca]

Principal component analysis (PCA) is a linear dimensionality reduction technique. It reduces a multivariate dataset to a smaller set of constructed variables preserving as much information (as much variance) as possible. These variables, called principal components, are linear combinations of the input variables.

Principal component analysis is a technique for decomposing an array of numerical data into a set of orthogonal vectors, called principal components, that define uncorrelated linear combinations of the variables. The first few principal components often capture most of the multivariate variability of the data, which is why PCA is used as a data reduction (dimensionality reduction) method.
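As a rough illustration of the description above, here is a minimal NumPy sketch that computes principal components via the singular value decomposition of the column-centered data. The function name `pca_svd`, its return values, and the synthetic data are assumptions made for this example only; they do not come from any particular package, and this is one common way to compute PCA rather than the only one.

```python
import numpy as np

def pca_svd(X, n_components):
    """Sketch of PCA on the rows of X via SVD of the centered data (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    Xc = X - mean                                # center each variable (column)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]               # rows: orthonormal principal directions
    scores = Xc @ components.T                   # data expressed in the new coordinates
    variances = s**2 / (X.shape[0] - 1)          # eigenvalues of the sample covariance matrix
    ratio = variances[:n_components] / variances.sum()
    return scores, components, ratio, mean

# Synthetic data: three variables driven by a single latent factor plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 1))
X = np.hstack([latent, 2 * latent, -latent]) + 0.05 * rng.normal(size=(200, 3))

scores, components, ratio, mean = pca_svd(X, n_components=1)

# Approximate reconstruction of the original variables from one component.
X_hat = scores @ components + mean
print("proportion of variance explained by PC1:", ratio[0])
```

Because the three variables here are nearly collinear by construction, a single component should account for almost all of the variance, and `X_hat` should be close to `X`. To work on the correlation matrix instead of the covariance matrix, one would also divide each centered column by its standard deviation before the SVD.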

3190 questions

1229

votes

27 answers

Making sense of principal component analysis, eigenvectors & eigenvalues

In today's pattern recognition class my professor talked about PCA, eigenvectors and eigenvalues. I understood the mathematics of it. If I'm asked to find eigenvalues etc. I'll do it correctly like a machine. But I didn't understand it. I didn't…

asked Sep 15 '10 at 20:05

claws

12,575
3
15
10

516

votes

3 answers

Relationship between SVD and PCA. How to use SVD to perform PCA?

Principal component analysis (PCA) is usually explained via an eigen-decomposition of the covariance matrix. However, it can also be performed via singular value decomposition (SVD) of the data matrix $\mathbf X$. How does it work? What is the…

pca dimensionality-reduction matrix svd faq

asked Jan 20 '15 at 23:47

amoeba

93,463
28
275
317

252

votes

15 answers

What are the differences between Factor Analysis and Principal Component Analysis?

It seems that a number of the statistical packages that I use wrap these two concepts together. However, I'm wondering if there are different assumptions or data 'formalities' that must be true to use one over the other. A real example would be…

pca factor-analysis

asked Aug 12 '10 at 03:46

Brandon Bertelsen

6,672
9
35
46

201

votes

6 answers

Can principal component analysis be applied to datasets containing a mix of continuous and categorical variables?

I have a dataset that has both continuous and categorical data. I am analyzing by using PCA and am wondering if it is fine to include the categorical variables as a part of the analysis. My understanding is that PCA can only be applied to continuous…

categorical-data pca correspondence-analysis mixed-type-data

asked Dec 28 '10 at 03:47

Nikolina Icitovic

2,011
3
13
4

196

votes

7 answers

PCA on correlation or covariance?

What are the main differences between performing principal component analysis (PCA) on the correlation matrix and on the covariance matrix? Do they give the same results?

correlation pca covariance factor-analysis

asked Jul 19 '10 at 19:39

Random

2,140
3
13
8

159

votes

5 answers

What's the difference between principal component analysis and multidimensional scaling?

How are PCA and classical MDS different? How about MDS versus nonmetric MDS? Is there a time when you would prefer one over the other? How do the interpretations differ?

pca multidimensional-scaling pcoa

asked Aug 08 '11 at 19:52

Stephen Turner

4,183
8
27
33

156

votes

1 answer

How to reverse PCA and reconstruct original variables from several principal components?

Principal component analysis (PCA) can be used for dimensionality reduction. After such dimensionality reduction is performed, how can one approximately reconstruct the original variables/features from a small number of principal…

pca dimensionality-reduction svd

asked Aug 09 '16 at 23:52

amoeba

93,463
28
275
317

136

votes

6 answers

Should one remove highly correlated variables before doing PCA?

I'm reading a paper where the author discards several variables due to high correlation to other variables before doing PCA. The total number of variables is around 20. Does this give any benefits? It looks like an overhead to me as PCA should handle…

correlation pca

asked Feb 21 '13 at 16:41

type2

1,471
3
10
4

118

votes

4 answers

PCA and proportion of variance explained

In general, what is meant by saying that the fraction $x$ of the variance in an analysis like PCA is explained by the first principal component? Can someone explain this intuitively but also give a precise mathematical definition of what "variance…

regression pca linear-model dimensionality-reduction

asked Feb 10 '12 at 05:36

user9097

2,973
7
18
11

103

votes

2 answers

Why do we need to normalize data before principal component analysis (PCA)?

I'm doing principal component analysis on my dataset and my professor told me that I should normalize the data before doing the analysis. Why? What would happen if I did PCA without normalization? Why do we normalize data in general? Could…

pca normalization dimensionality-reduction

asked Sep 04 '13 at 08:12

jjepsuomi

5,207
11
34
47

votes

5 answers

What is the relation between k-means clustering and PCA?

It is a common practice to apply PCA (principal component analysis) before a clustering algorithm (such as k-means). It is believed that it improves the clustering results in practice (noise reduction). However I am interested in a comparative and…

clustering pca k-means

asked Nov 23 '15 at 22:42

mic

3,848
3
23
38

votes

5 answers

Loadings vs eigenvectors in PCA: when to use one or another?

In principal component analysis (PCA), we get eigenvectors (unit vectors) and eigenvalues. Now, let us define loadings as $$\text{Loadings} = \text{Eigenvectors} \cdot \sqrt{\text{Eigenvalues}}.$$ I know that eigenvectors are just directions and…

pca

asked Mar 29 '15 at 09:23

user2696565

1,239
1
10
14

votes

7 answers

What are principal component scores?

What are principal component scores (PC scores, PCA scores)?

pca definition

asked Jul 20 '10 at 05:37

vrish88

1,143
1
9
8

votes

4 answers

What're the differences between PCA and autoencoder?

Both PCA and autoencoders can do dimension reduction, so what are the differences between them? In what situations should I use one over the other?

machine-learning pca neural-networks autoencoders

asked Oct 15 '14 at 07:26

RockTheStar

11,277
31
63
89

votes

4 answers

How to visualize what canonical correlation analysis does (in comparison to what principal component analysis does)?

Canonical correlation analysis (CCA) is a technique related to principal component analysis (PCA). While it is easy to teach PCA or linear regression using a scatter plot (see a few thousand examples on google image search), I have not seen a…

regression data-visualization pca canonical-correlation geometry

asked Jul 26 '13 at 20:28

figure
