Let's say $\vec{x}$ is an $n$-dimensional observation, $\vec{\mu}$ the $n$-dimensional mean of the sample that $\vec{x}$ is drawn from, and $\Sigma$ the $n \times n$ covariance matrix of that sample.
Then the Mahalanobis distance is defined as $$ mahal(\vec{x}) = \sqrt{(\vec{x} - \vec{\mu})^T \Sigma^{-1} (\vec{x} - \vec{\mu})} $$ Intuitively, this function does the following:
1. Center $\vec{x}$ around the mean $\vec{\mu}$.
2. "Remove" the covariance and variance inherent in $\vec{x}$ by applying $\Sigma^{-1}$.
3. Compute the dot product between the untransformed $(\vec{x} - \vec{\mu})$ on the left and the transformed $(\vec{x} - \vec{\mu})$ on the right.
4. Take the square root of the dot product from 3., which amounts to the Euclidean distance (see the sketch after this list).
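For concreteness, here is a minimal numeric sketch of those four steps, assuming NumPy and SciPy are available; the sample data is made up purely for illustration:

```python
import numpy as np
from scipy.spatial.distance import mahalanobis

# Made-up 2-dimensional sample, purely for illustration
rng = np.random.default_rng(0)
sample = rng.multivariate_normal([0.0, 0.0], [[2.0, 0.8], [0.8, 1.0]], size=500)

x = sample[0]                        # one observation
mu = sample.mean(axis=0)             # sample mean
cov = np.cov(sample, rowvar=False)   # sample covariance matrix

d = x - mu                           # 1. center x around the mean
right = np.linalg.inv(cov) @ d       # 2. "remove" the covariance with Sigma^{-1}
dot = d @ right                      # 3. dot product of left and right vectors
mahal_x = np.sqrt(dot)               # 4. square root

# Cross-check against SciPy's implementation (which takes the inverse covariance)
assert np.isclose(mahal_x, mahalanobis(x, mu, np.linalg.inv(cov)))
```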
The problem I see is that it only "removes" the covariance from the right $(\vec{x} - \vec{\mu})$, but not from the left one. This means it isn't the Euclidean distance in the space with basis $\Sigma$, since the left-hand side didn't change basis to $\Sigma$ like the right-hand side did.
As I see it, the Mahalanobis distance would have to be defined like this for it to be the Euclidean distance: $$ mahal'(\vec{x}) = \sqrt{(\Sigma^{-1} (\vec{x} - \vec{\mu}))^T (\Sigma^{-1} (\vec{x} - \vec{\mu}))} $$
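A quick check, reusing `d`, `cov`, and `mahal_x` from the sketch above (just an illustrative comparison, not a claim about which definition is correct), confirms that the two expressions give different values; note that $mahal'$ effectively computes $\sqrt{(\vec{x} - \vec{\mu})^T \Sigma^{-2} (\vec{x} - \vec{\mu})}$:

```python
# Reusing d, cov, and mahal_x from above: compare mahal and mahal' numerically
VI = np.linalg.inv(cov)
left = VI @ d                          # Sigma^{-1} applied to the left vector too
mahal_prime = np.sqrt(left @ left)     # = sqrt(d^T Sigma^{-2} d)
print(mahal_x, mahal_prime)            # the two values generally differ
```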
I also tried to decompose $\Sigma = E \Lambda E^{-1}$, where the columns of $E$ are the eigenvectors of $\Sigma$ and $\Lambda$ is the diagonal matrix of its eigenvalues. Since $\Sigma$ is symmetric, $E$ can be chosen orthonormal, so $E^{-1} = E^T$ and the Mahalanobis distance can be written as $$ \begin{align} mahal(\vec{x}) &= \sqrt{(\vec{x} - \vec{\mu})^T \Sigma^{-1} (\vec{x} - \vec{\mu})} \\ &= \sqrt{(E^{-1} (\vec{x} - \vec{\mu}))^T \Lambda^{-1} (E^{-1} (\vec{x} - \vec{\mu}))} \end{align} $$ This fits my intuition of Euclidean distance much better, except that the right-hand side is scaled by $\Lambda^{-1}$, which makes it not the same as the Euclidean distance.
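The equality of the two forms can be checked numerically, again reusing `d`, `cov`, and `mahal_x` from the sketch above; `np.linalg.eigh` returns orthonormal eigenvectors because `cov` is symmetric, which is what justifies $E^{-1} = E^T$:

```python
# Eigendecomposition form: cov = E @ diag(evals) @ E.T
evals, E = np.linalg.eigh(cov)         # orthonormal eigenvectors as columns of E
y = E.T @ d                            # E^{-1} d, using E^{-1} = E^T
mahal_eig = np.sqrt(y @ np.diag(1.0 / evals) @ y)
assert np.isclose(mahal_x, mahal_eig)  # matches the original definition
```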