Let's say $\vec{x}$ is an $n$-dimensional observation, $\vec{\mu}$ the $n$-dimensional mean of the sample that $\vec{x}$ is drawn from, and $\Sigma$ the $n \times n$ covariance matrix of that sample.
Then the Mahalanobis distance is defined as $$ mahal(\vec{x}) = \sqrt{(\vec{x} - \vec{\mu})^T \Sigma^{-1} (\vec{x} - \vec{\mu})} $$ Intuitively, this function does the following:
1. Center $\vec{x}$ around the mean $\vec{\mu}$.
2. "Remove" the covariance and variance inherent in $\vec{x}$ by applying $\Sigma^{-1}$.
3. Compute the dot product between the untransformed $(\vec{x} - \vec{\mu})$ on the left and the transformed $(\vec{x} - \vec{\mu})$ on the right.
4. Take the square root of the dot product from 3., which amounts to the Euclidean distance (see the sketch after this list).
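For concreteness, here is a minimal numeric sketch of those four steps, assuming NumPy and SciPy are available; the sample data is made up purely for illustration:

```python
import numpy as np
from scipy.spatial.distance import mahalanobis

# Made-up 2-dimensional sample, purely for illustration
rng = np.random.default_rng(0)
sample = rng.multivariate_normal([0.0, 0.0], [[2.0, 0.8], [0.8, 1.0]], size=500)

x = sample[0]                        # one observation
mu = sample.mean(axis=0)             # sample mean
cov = np.cov(sample, rowvar=False)   # sample covariance matrix

d = x - mu                           # 1. center x around the mean
right = np.linalg.inv(cov) @ d       # 2. "remove" the covariance with Sigma^{-1}
dot = d @ right                      # 3. dot product of left and right vectors
mahal_x = np.sqrt(dot)               # 4. square root

# Cross-check against SciPy's implementation (which takes the inverse covariance)
assert np.isclose(mahal_x, mahalanobis(x, mu, np.linalg.inv(cov)))
```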
The problem I see is that it only "removes" the covariance from the right $(\vec{x} - \vec{\mu})$, but not from the left one. This means it isn't the Euclidean distance in the space with basis $\Sigma$, since the left-hand side didn't change basis to $\Sigma$ like the right-hand side did.
As I see it, the Mahalanobis distance would have to be defined like this for it to be the Euclidean distance: $$ mahal'(\vec{x}) = \sqrt{(\Sigma^{-1} (\vec{x} - \vec{\mu}))^T (\Sigma^{-1} (\vec{x} - \vec{\mu}))} $$
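A quick check, reusing `d`, `cov`, and `mahal_x` from the sketch above (just an illustrative comparison, not a claim about which definition is correct), confirms that the two expressions give different values; note that $mahal'$ effectively computes $\sqrt{(\vec{x} - \vec{\mu})^T \Sigma^{-2} (\vec{x} - \vec{\mu})}$:

```python
# Reusing d, cov, and mahal_x from above: compare mahal and mahal' numerically
VI = np.linalg.inv(cov)
left = VI @ d                          # Sigma^{-1} applied to the left vector too
mahal_prime = np.sqrt(left @ left)     # = sqrt(d^T Sigma^{-2} d)
print(mahal_x, mahal_prime)            # the two values generally differ
```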
I also tried to decompose $\Sigma = E \Lambda E^{-1}$, where the columns of $E$ are the eigenvectors of $\Sigma$ and $\Lambda$ is the diagonal matrix of its eigenvalues. Since $\Sigma$ is symmetric, $E$ can be chosen orthonormal, so $E^{-1} = E^T$ and the Mahalanobis distance can be written as $$ \begin{align} mahal(\vec{x}) &= \sqrt{(\vec{x} - \vec{\mu})^T \Sigma^{-1} (\vec{x} - \vec{\mu})} \\ &= \sqrt{(E^{-1} (\vec{x} - \vec{\mu}))^T \Lambda^{-1} (E^{-1} (\vec{x} - \vec{\mu}))} \end{align} $$ This fits my intuition of Euclidean distance much better, except that the right-hand side is scaled by $\Lambda^{-1}$, which makes it not the same as the Euclidean distance.
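The equality of the two forms can be checked numerically, again reusing `d`, `cov`, and `mahal_x` from the sketch above; `np.linalg.eigh` returns orthonormal eigenvectors because `cov` is symmetric, which is what justifies $E^{-1} = E^T$:

```python
# Eigendecomposition form: cov = E @ diag(evals) @ E.T
evals, E = np.linalg.eigh(cov)         # orthonormal eigenvectors as columns of E
y = E.T @ d                            # E^{-1} d, using E^{-1} = E^T
mahal_eig = np.sqrt(y @ np.diag(1.0 / evals) @ y)
assert np.isclose(mahal_x, mahal_eig)  # matches the original definition
```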