
I was reading about deep metric learning (from here) and came across the Mahalanobis distance. I understand why we cannot use the Euclidean distance when the distribution is not isotropic (the covariance between the dimensions of our data is not 0, so they are not independent).
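To make the isotropy point concrete, here is a small NumPy sketch with made-up data (the covariance and point choices are just mine for illustration): two points at the same Euclidean distance from the mean get very different Mahalanobis distances under a strongly correlated distribution.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up anisotropic 2D data: the two dimensions are strongly correlated.
cov_true = np.array([[3.0, 2.5],
                     [2.5, 3.0]])
X = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov_true, size=5000)

C = np.cov(X, rowvar=False)      # estimated covariance matrix
C_inv = np.linalg.inv(C)
mean = X.mean(axis=0)

# Two points at the same Euclidean distance from the mean:
p_along = mean + np.array([1.0, 1.0])    # along the main axis of variation
p_across = mean + np.array([1.0, -1.0])  # across it

def mahalanobis_sq(x, y, C_inv):
    """Squared Mahalanobis distance (x - y)^T C^{-1} (x - y)."""
    diff = x - y
    return diff @ C_inv @ diff

print(np.linalg.norm(p_along - mean), np.linalg.norm(p_across - mean))              # equal
print(mahalanobis_sq(p_along, mean, C_inv), mahalanobis_sq(p_across, mean, C_inv))  # very different
```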

We want to take the covariance of the distribution into account. Moreover, we assume that both points (between which we want to measure the distance) come from the same distribution. The general formula for the squared Mahalanobis distance between $x$ and $y$ is:

$d^2(x,y) = (x-y)^T C^{-1} (x-y)$

where $C$ is the covariance matrix. Since $x$ and $y$ are $d$-dimensional, $C$ is a $d \times d$ matrix. Everything is clear to me up to this point. But my question is: how can this matrix be decomposed as $W^TW$? They state it here.
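As a purely illustrative check of the formula (with a random positive definite matrix of my own standing in for the covariance), the quadratic form above matches SciPy's built-in Mahalanobis distance:

```python
import numpy as np
from scipy.spatial.distance import mahalanobis

rng = np.random.default_rng(1)
d = 4

# Build a random valid covariance matrix C (symmetric positive definite).
A = rng.normal(size=(d, d))
C = A @ A.T + d * np.eye(d)
C_inv = np.linalg.inv(C)

x = rng.normal(size=d)
y = rng.normal(size=d)

d_sq = (x - y) @ C_inv @ (x - y)  # the formula above
print(np.isclose(np.sqrt(d_sq), mahalanobis(x, y, C_inv)))  # True
```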

My question: why is the matrix $C^{-1}$ in the Mahalanobis distance equal to $W^TW$? Suppose that my $x$ and $y$ are the final feature vectors of a Siamese network. I would then assume that $W$ is the weight matrix of the last layer of the network. How does $W^TW$ give the inverse covariance matrix of the distribution of the last layer's feature vectors?
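Here is a minimal sketch of my current understanding (the matrix names and shapes are just my assumptions): any symmetric positive definite matrix, and in particular $C^{-1}$, can be written as $W^TW$, for example via a Cholesky factorization, and the Mahalanobis distance then becomes an ordinary Euclidean distance after the linear map $W$, which is exactly what a last linear layer computes. What I do not see is why $W^TW$ of the actual learned weight matrix should equal the inverse covariance of the feature distribution.

```python
import numpy as np

rng = np.random.default_rng(2)
d = 4

# A symmetric positive definite matrix playing the role of C^{-1}.
A = rng.normal(size=(d, d))
M = A @ A.T + d * np.eye(d)

# Cholesky gives M = L L^T, so with W = L^T we have M = W^T W.
L = np.linalg.cholesky(M)
W = L.T
print(np.allclose(W.T @ W, M))  # True

# The Mahalanobis form equals a Euclidean distance in the W-mapped space:
x = rng.normal(size=d)
y = rng.normal(size=d)
lhs = (x - y) @ M @ (x - y)
rhs = np.linalg.norm(W @ (x - y)) ** 2
print(np.isclose(lhs, rhs))  # True
```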

Mas A
  • Please tell us what "$W$" refers to. Evidently it is a matrix with centered columns, but we can only guess that. – whuber Jan 09 '22 at 20:57

0 Answers