Can anyone explain what the proportion of variance explained means in PCA, and why it is important when interpreting a PCA?

Michael R. Chernick
srimaster

1 Answer

For multivariate observations of potentially correlated data in, say, n dimensions, principal component analysis produces up to n orthogonal (uncorrelated) variables, the principal components. The first principal component lies in the direction of largest spread, or variance, and each subsequent component lies in the direction of largest remaining variance orthogonal to the earlier ones. The sum of the variances of the n components is the total variance. The first r principal components capture more variance than any other r orthogonal directions. The proportion of variance explained by the first r principal components is simply the sum of their variances divided by the total variance of all n principal components. This matters because a small number of principal components may explain a large share of the total variance (say 80%), which allows dimensionality reduction to those r principal components, each of which is a linear combination of the original variables. This is also explained in a number of questions on this site, including the one linked by David Kozak.
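
The ratio described above can be sketched in a few lines of NumPy. This is only an illustration, not part of the original answer: the function name `explained_variance_ratio`, the simulated data, and the 80% threshold are all assumptions chosen for the example. It relies on the fact that the variances of the principal components are the eigenvalues of the sample covariance matrix.

```python
import numpy as np

def explained_variance_ratio(X):
    """Proportion of total variance carried by each principal component.

    X is an (observations x variables) array. The name and interface are
    hypothetical, chosen for this illustration only.
    """
    # Sample covariance matrix of the variables (columns of X)
    cov = np.cov(X, rowvar=False)
    # Eigenvalues of the covariance matrix are the variances of the
    # principal components; eigvalsh returns them in ascending order
    eigvals = np.linalg.eigvalsh(cov)[::-1]
    # Guard against tiny negative values caused by rounding
    eigvals = np.clip(eigvals, 0.0, None)
    return eigvals / eigvals.sum()

# Simulated (made-up) data: three correlated variables driven by one
# latent factor plus independent noise
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 1))
X = latent @ np.array([[2.0, 1.0, 0.5]]) + 0.3 * rng.normal(size=(500, 3))

ratio = explained_variance_ratio(X)
cumulative = np.cumsum(ratio)
# Smallest r such that the first r components explain at least 80%
r = int(np.searchsorted(cumulative, 0.80) + 1)

print("proportion per component:", ratio)
print("cumulative proportion:   ", cumulative)
print("components needed for 80%:", r)
```

Because the three simulated variables share a single underlying factor, the first component should account for most of the variance here, which is exactly the situation in which reducing to a few components loses little information.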

Michael R. Chernick