
So I recently found out about the Mahalanobis distance. Given a random variable $x$ in $N$-dimensional space, an associated metric is defined by

$$M(x) = \sqrt{(x-\mu)^T S^{-1}(x-\mu)}$$

where $\mu$ and $S$ are the mean and covariance of $x$. Further, given two i.i.d. random variables $x_1$ and $x_2$, one can define the associated distance

$$||x_1 - x_2||_M=M(x_1-x_2)$$
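
To make the definition concrete, here is a minimal numpy sketch of $M(x)$ (the toy data, dimensions and seed are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 1000 samples of a correlated 3-dimensional random variable x.
X = rng.multivariate_normal(
    mean=[1.0, -2.0, 0.5],
    cov=[[2.0, 0.8, 0.3],
         [0.8, 1.5, 0.4],
         [0.3, 0.4, 1.0]],
    size=1000,
)

mu = X.mean(axis=0)            # sample mean
S = np.cov(X, rowvar=False)    # sample covariance (N x N)
S_inv = np.linalg.inv(S)

def mahalanobis(x, mu, S_inv):
    """M(x) = sqrt((x - mu)^T S^{-1} (x - mu))."""
    d = x - mu
    return np.sqrt(d @ S_inv @ d)

print(mahalanobis(X[0], mu, S_inv))
```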

This operation seems cool because it decorrelates the data before computing the metric. So I was thinking: is there such a thing as decorrelating the data in general? Namely, I would like to transform my data into an uncorrelated form

$$y = A(x-\mu)$$

for which the covariance is just an identity matrix

$$\operatorname{cov}(y) = I$$

and the Mahalanobis metric coincides with the Euclidean metric

$$M(y) = ||y||_2$$

In order to achieve this, I need to find $A$ such that

$$A^TA=S^{-1}$$
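
As a sanity check that such an $A$ does what I want, here is a sketch using the inverse Cholesky factor (just one possible choice, see question 2 below; it continues the numpy snippet above):

```python
# Continuing from the snippet above (X, mu, S, S_inv, mahalanobis).
L = np.linalg.cholesky(S)      # S = L L^T, with L lower triangular
A = np.linalg.inv(L)           # then A^T A = L^{-T} L^{-1} = S^{-1}

Y = (X - mu) @ A.T             # y = A (x - mu), applied to each row

# The transformed data has identity covariance (up to floating-point error) ...
print(np.round(np.cov(Y, rowvar=False), 3))

# ... and the Euclidean norm of y matches the Mahalanobis metric of x.
print(np.linalg.norm(Y[0]), mahalanobis(X[0], mu, S_inv))
```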

Questions:

  1. Have people tried this before? Under what name is this procedure known in the literature?
  2. I realise that $A$ is not uniquely defined. What is a good way to define $A$? Is Cholesky decomposition a good way? (A few candidate choices are sketched below.)
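
For reference, here are the candidate constructions of $A$ I have in mind (continuing the snippets above; the variable names are mine). All of them satisfy $A^TA = S^{-1}$, and any two differ only by left-multiplication with an orthogonal matrix:

```python
# Continuing from the snippets above (S, S_inv).

# Choice 1: inverse Cholesky factor (triangular A), as used above.
A_chol = np.linalg.inv(np.linalg.cholesky(S))

# Choice 2: "PCA" whitening -- rotate onto the eigenvectors of S,
# then rescale each component to unit variance.
eigval, eigvec = np.linalg.eigh(S)            # S = V diag(eigval) V^T
A_pca = np.diag(1.0 / np.sqrt(eigval)) @ eigvec.T

# Choice 3: the symmetric square root S^{-1/2} ("ZCA" whitening).
A_zca = eigvec @ np.diag(1.0 / np.sqrt(eigval)) @ eigvec.T

# All three satisfy A^T A = S^{-1}; so does Q A for any orthogonal Q.
for A in (A_chol, A_pca, A_zca):
    print(np.allclose(A.T @ A, S_inv))
```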

Note:

  • I have previously asked this question on Math SE, but I figured I might get more background info on this method here.
  • This is not a general answer, but it seems to be closely related to data whitening: https://en.wikipedia.org/wiki/Whitening_transformation. This Wikipedia entry also addresses the nonuniqueness of the linear transformation. – Christopher Krapu Oct 26 '20 at 14:54
  • https://stats.stackexchange.com/a/62147/919 provides a geometric answer to your questions. – whuber Oct 26 '20 at 19:02
  • @whuber thanks, that looks like a nice post, I will read. I already realised there is a strong link with PCA, so I suspect $A$ is simply the PCA transform matrix, followed by normalizing each principal component to have unit variance. – Aleksejs Fomins Oct 26 '20 at 20:25
