Meaning of PCA-based distances

Asked Oct 19 '21 at 15:31

Active Oct 19 '21 at 15:53

Suppose principal component analysis (PCA) has been performed on a dataset. What do the distances between data points in the transformed (PCA) space mean? For instance, if the Euclidean distance between point A and point B is calculated on their PCA coordinates, is it equivalent to the Mahalanobis distance between A and B in the original coordinates?

NOTE: I am assuming that the data is scaled as part of the PCA. The R code for the PCA step would look like this (where df is the data frame containing the original records):

pca <- prcomp(df, scale. = TRUE)
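The question can be checked numerically. The following is a minimal sketch, not a definitive answer: it uses the built-in iris measurements as a stand-in for df (an assumption, since the original data is not shown), keeps all components, and compares rows 1 and 51 as points A and B. It compares the Euclidean distance on the raw PC scores, the Euclidean distance on the scores divided by each component's standard deviation, and the Mahalanobis distance on the original variables.

```r
# Stand-in data (assumption): four numeric columns from the built-in iris set
df <- iris[, 1:4]
pca <- prcomp(df, scale. = TRUE)

a <- 1   # row index of point A (arbitrary choice)
b <- 51  # row index of point B (arbitrary choice)

# Euclidean distance on the raw PC scores, all components kept
scores <- pca$x
d_pc <- sqrt(sum((scores[a, ] - scores[b, ])^2))

# Euclidean distance on the centered and scaled original variables
Z <- scale(df)
d_scaled <- sqrt(sum((Z[a, ] - Z[b, ])^2))

# Euclidean distance on whitened scores: each component divided by its sdev
white <- sweep(scores, 2, pca$sdev, "/")
d_white <- sqrt(sum((white[a, ] - white[b, ])^2))

# Mahalanobis distance on the original variables
# (stats::mahalanobis returns the squared distance)
x <- as.matrix(df)
d_mahal <- sqrt(mahalanobis(x[a, ], center = x[b, ], cov = cov(x)))

# Raw scores reproduce the Euclidean distance on the scaled data
print(all.equal(d_pc, d_scaled))
# Whitened scores reproduce the Mahalanobis distance
print(all.equal(d_white, d_mahal))
```

Under these assumptions, the raw scores only rotate the scaled data, so their Euclidean distance matches the Euclidean distance on the scaled variables. It is the whitened scores whose Euclidean distance matches the Mahalanobis distance. If only the first few components are kept, both equalities become approximations.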

edited Oct 19 '21 at 15:53

asked Oct 19 '21 at 15:31

dmb

Depending on how you applied PCA (on the correlation or the covariance matrix), the distances can correspond to the Mahalanobis distance – Aksakal Oct 19 '21 at 16:00
PCA on correlation (the data is scaled) – dmb Oct 19 '21 at 17:32
I think all of the facts you need to demonstrate this relationship are in https://stats.stackexchange.com/questions/134282/relationship-between-svd-and-pca-how-to-use-svd-to-perform-pca -- the rest is just substitutions and algebra. – Sycorax Oct 20 '21 at 15:56
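The substitutions mentioned in the last comment can be sketched as follows. The notation is an assumption introduced here, not taken from the thread: $z_A, z_B$ are the centered and scaled coordinates of the two points, $R = V \Lambda V^\top$ is the eigendecomposition of the correlation matrix (assumed full rank), $t = V^\top z$ are the PC scores with all $p$ components kept, and $\lambda_i$ are the eigenvalues.

```latex
\begin{aligned}
\lVert t_A - t_B \rVert^2
  &= (z_A - z_B)^\top V V^\top (z_A - z_B)
   = \lVert z_A - z_B \rVert^2, \\
d_M^2(A, B)
  &= (z_A - z_B)^\top R^{-1} (z_A - z_B)
   = (t_A - t_B)^\top \Lambda^{-1} (t_A - t_B)
   = \sum_{i=1}^{p} \frac{(t_{A,i} - t_{B,i})^2}{\lambda_i}.
\end{aligned}
```

Read this way, the Euclidean distance on the raw scores equals the Euclidean distance on the scaled data, while the Mahalanobis distance equals the Euclidean distance on scores that have each been divided by $\sqrt{\lambda_i}$. Because the Mahalanobis distance is unchanged by rescaling individual variables, the same value is obtained from the unscaled original coordinates with their covariance matrix.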

0 Answers