
I'm taking my first course on data science, and I encountered the Mahalanobis distance for the first time. It was mentioned that, intuitively, it corrects for the fact that some attributes in a dataset might be correlated (so, in effect, correlated attributes might "count twice" if we were to use the Euclidean distance).

From this it makes sense that it would use the covariance matrix, since that matrix contains the information about which columns correlate with each other and how strongly, but I am not sure why one would need to invert it.

So why is the covariance matrix inverted in the definition? Would a "non-inverted" version of the Mahalanobis distance not be a useful distance metric? If so, why?
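To make the question concrete, here is a small numpy sketch of my understanding (the toy data and the two test points are arbitrary choices of mine, just for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)  # toy data, chosen just for illustration

# Two strongly correlated attributes: the second mostly repeats the first.
z = rng.standard_normal(1000)
X = np.column_stack([z, 0.95 * z + 0.3 * rng.standard_normal(1000)])

mu = X.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(X, rowvar=False))

def mahalanobis(x):
    d = x - mu
    return np.sqrt(d @ cov_inv @ d)

# Two points at the SAME Euclidean distance from the mean: one lies along
# the correlated direction, the other goes against the correlation.
along = mu + np.array([2.0, 2.0])
across = mu + np.array([2.0, -2.0])

for name, p in [("along", along), ("across", across)]:
    print(name, np.linalg.norm(p - mu), mahalanobis(p))
```

Both test points are equally far from the mean in the Euclidean sense, but the Mahalanobis distance comes out much larger for the point that goes against the correlation, which seems to be exactly what the inverse covariance is doing.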

maritsm
  • Inversion is a vector version of division, and you divide a distance by a unit of distance (kilometers/yards/light-years, etc.). – seanv507 Nov 06 '21 at 15:44
  • In one dimension, if you measured how many standard deviations $x$ was from the mean $\mu$ of a distribution with variance $\sigma^2$, you would get $\Big|\frac{x-\mu}{\sigma}\Big|$, which you could rewrite as $\sqrt{(x-\mu)(\sigma^2)^{-1}(x-\mu)}$. Taking this to the multivariate case with vectors and matrices, and using the covariance matrix $\Sigma$ instead of $\sigma^2$, you get the analogous $\sqrt{(\mathbf x-\boldsymbol\mu)^T\Sigma^{-1}(\mathbf x-\boldsymbol\mu)}$, i.e. the Mahalanobis distance using the inverse. – Henry Nov 06 '21 at 17:04
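To check Henry's identity numerically, here is a small numpy sketch (the values of $\Sigma$, $\mu$, and $x$ are arbitrary choices of mine, not from the thread):

```python
import numpy as np

# Arbitrary illustrative values (my choice, not from the comment above):
Sigma = np.array([[2.0, 1.2],
                  [1.2, 1.0]])
mu = np.array([0.5, -1.0])
x = np.array([2.0, 1.0])

d = x - mu

# Mahalanobis distance exactly as written in the comment above.
maha = np.sqrt(d @ np.linalg.inv(Sigma) @ d)

# Multivariate analogue of dividing by sigma: "whiten" the displacement
# with Sigma^{-1/2}, then take the plain Euclidean norm.
vals, vecs = np.linalg.eigh(Sigma)
whitened = vecs @ np.diag(vals ** -0.5) @ vecs.T @ d
print(maha, np.linalg.norm(whitened))  # the two numbers agree
```

The two printed numbers agree, so $\Sigma^{-1}$ seems to play the role of $1/\sigma^2$: it converts a raw displacement into a count of standard deviations along each direction of variation.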
