Questions tagged [metric]

A metric is a function that outputs a distance between two elements of a set and meets certain strict criteria (some 'distance' functions are not metrics).

A metric is a function that outputs a distance between two elements of a set. To meet the definition of a metric, a distance function must fulfill the following criteria (sketched in code after the list):

  1. The distance between an element and itself is zero: $d(x_i,x_i)=0$.
  2. If the distance between two elements is $0$, those elements are identical: $d(x_i,x_j)=0\implies x_i=x_j$.
  3. All distances are non-negative: $d(x_i,x_j)\ge0$.
  4. The distance between two elements is the same in either direction: $d(x_i,x_j)=d(x_j,x_i)$.
  5. The distance between two elements is less than or equal to the sum of their distances to a third element: $d(x_i,x_j)\le d(x_i,x_k)+d(x_k,x_j)$.
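As an illustration, here is a minimal Python sketch (assuming NumPy; the function name and sample points are made up) that spot-checks criteria 1 and 3–5 for the Euclidean distance on random points. Criterion 2 cannot be confirmed by sampling alone, since it quantifies over all pairs of elements.

```python
import numpy as np
from itertools import combinations

def spot_check_metric(d, points, tol=1e-9):
    """Spot-check metric criteria 1, 3, 4 and 5 for a distance function d on sample points."""
    for x in points:
        assert abs(d(x, x)) <= tol                    # 1. d(x, x) = 0
    for x, y in combinations(points, 2):
        assert d(x, y) >= -tol                        # 3. non-negativity
        assert abs(d(x, y) - d(y, x)) <= tol          # 4. symmetry
    for x, y, z in combinations(points, 3):
        assert d(x, y) <= d(x, z) + d(z, y) + tol     # 5. triangle inequality
    return True

euclidean = lambda a, b: float(np.linalg.norm(a - b))
rng = np.random.default_rng(0)
points = [rng.standard_normal(3) for _ in range(15)]
print(spot_check_metric(euclidean, points))           # True: Euclidean distance passes
```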
343 questions
328 votes · 8 answers

Why is Euclidean distance not a good metric in high dimensions?

I read that 'Euclidean distance is not a good distance in high dimensions'. I guess this statement has something to do with the curse of dimensionality, but what exactly? Besides, what is 'high dimensions'? I have been applying hierarchical…
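A rough illustration of the phenomenon the question asks about (my own sketch, assuming NumPy/SciPy; the sample sizes are arbitrary): for i.i.d. uniform points, the contrast between the nearest and farthest pairwise Euclidean distances shrinks as the dimension grows.

```python
import numpy as np
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
for dim in (2, 10, 100, 1000):
    X = rng.random((200, dim))            # 200 points uniform in the unit hypercube
    d = pdist(X)                          # all pairwise Euclidean distances
    # relative contrast (max - min) / min tends to shrink as dim grows
    print(dim, (d.max() - d.min()) / d.min())
```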
60 votes · 5 answers

What are the advantages of the Wasserstein metric compared to Kullback-Leibler divergence?

What is the practical difference between Wasserstein metric and Kullback-Leibler divergence? Wasserstein metric is also referred to as Earth mover's distance. From Wikipedia: Wasserstein (or Vaserstein) metric is a distance function defined between…
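A small numeric sketch of the two quantities being compared (assuming SciPy; the distributions are made up), using SciPy's 1-D earth mover's distance and its KL divergence via `entropy`:

```python
import numpy as np
from scipy.stats import wasserstein_distance, entropy

support = np.arange(4)                    # common support {0, 1, 2, 3}
p = np.array([0.1, 0.4, 0.4, 0.1])
q = np.array([0.3, 0.2, 0.2, 0.3])

# 1-D Wasserstein (earth mover's) distance between the two distributions
print(wasserstein_distance(support, support, u_weights=p, v_weights=q))
# Kullback-Leibler divergence KL(p || q); unlike Wasserstein, it is not symmetric
print(entropy(p, q), entropy(q, p))
```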
50 votes · 6 answers

Percentage of overlapping regions of two normal distributions

I was wondering, given two normal distributions with $\sigma_1,\ \mu_1$ and $\sigma_2, \ \mu_2$, how can I calculate the percentage of overlapping regions of the two distributions? I suppose this problem has a specific name; are you aware of any…
Ali Salehi · 603
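One way to compute the quantity this question asks for (a sketch of my own, assuming SciPy; not necessarily the closed-form answer the poster was after) is to integrate the pointwise minimum of the two densities:

```python
from scipy.stats import norm
from scipy.integrate import quad

def overlap_area(mu1, sigma1, mu2, sigma2):
    """Overlapping area of two normal densities: the integral of min(f1, f2)."""
    f = lambda x: min(norm.pdf(x, mu1, sigma1), norm.pdf(x, mu2, sigma2))
    lo = min(mu1 - 10 * sigma1, mu2 - 10 * sigma2)   # integration limits far in the tails
    hi = max(mu1 + 10 * sigma1, mu2 + 10 * sigma2)
    area, _ = quad(f, lo, hi, limit=200)
    return area

print(overlap_area(0, 1, 1, 1.5))   # fraction of area shared by N(0,1) and N(1,1.5)
```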
40 votes · 4 answers

Recall and precision in classification

I read some definitions of recall and precision, though every time in the context of information retrieval. I was wondering if someone could explain this a bit more in a classification context and maybe illustrate some examples. Say for…
Olivier_s_j · 1,055
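In a classification context the two quantities reduce to counts from the confusion matrix; a tiny sketch with made-up labels (not from the question):

```python
# Made-up binary labels: 1 = positive class, 0 = negative class
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))  # true positives
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))  # false positives
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))  # false negatives

precision = tp / (tp + fp)   # of everything predicted positive, how much really is
recall = tp / (tp + fn)      # of everything truly positive, how much was found
print(precision, recall)     # 0.75 0.75 for these labels
```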
28 votes · 5 answers

Loss function and evaluation metric

When building a learning algorithm we are looking to maximize a given evaluation metric (say accuracy), but the algorithm will try to optimize a different loss function during learning (say MSE/entropy). Why are the evaluation metrics not used as…
Jesús Ros · 408
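The distinction the question draws can be seen directly in scikit-learn (a hedged sketch on synthetic data): the model is fit by minimizing a smooth loss, while the reported evaluation metric is computed separately afterwards.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, log_loss
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = LogisticRegression().fit(X_tr, y_tr)       # training minimizes log loss, not accuracy

print(log_loss(y_te, clf.predict_proba(X_te)))   # the (differentiable) training-style loss
print(accuracy_score(y_te, clf.predict(X_te)))   # the evaluation metric actually reported
```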
23 votes · 5 answers

How to control the cost of misclassification in Random Forests?

Is it possible to control the cost of misclassification in the R package randomForest? In my own work false negatives (e.g., missing in error that a person may have a disease) are far more costly than false positives. The package rpart allows the…
user5944 · 231
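The question is about the R package randomForest; as a rough analogue only (not the R API), scikit-learn's random forest exposes per-class weights that make errors on a rare class count more heavily:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Imbalanced synthetic data: class 1 (the "disease" class) is rare
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)

# Weight errors on class 1 ten times more than on class 0 (the weights are illustrative)
clf = RandomForestClassifier(class_weight={0: 1, 1: 10}, random_state=0).fit(X, y)
print(clf.score(X, y))
```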
22 votes · 2 answers

Comparing clusterings: Rand Index vs Variation of Information

I was wondering if anybody had any insight or intuition behind the difference between the Variation of Information and the Rand Index for comparing clusterings. I have read the paper "Comparing Clusterings - An Information Based Distance" by Marina…
Amelio Vazquez-Reina · 17,546
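For concreteness (a sketch of my own, assuming scikit-learn and NumPy), both quantities can be computed on a pair of toy clusterings; the Variation of Information is assembled from entropies and mutual information:

```python
import numpy as np
from sklearn.metrics import adjusted_rand_score, mutual_info_score

a = [0, 0, 1, 1, 2, 2]        # two clusterings of the same six items
b = [0, 0, 1, 2, 2, 2]

def label_entropy(labels):
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log(p)).sum())

# Variation of Information: VI(a, b) = H(a) + H(b) - 2 I(a, b), in nats
vi = label_entropy(a) + label_entropy(b) - 2 * mutual_info_score(a, b)
print(adjusted_rand_score(a, b), vi)
```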
19 votes · 3 answers

Jensen Shannon Divergence vs Kullback-Leibler Divergence?

I know that KL divergence is not symmetric and cannot strictly be considered a metric. Given that, why is it used when JS divergence satisfies the required properties for a metric? Are there scenarios where KL divergence can be used but not JS…
user2761431 · 309
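A quick numeric check of the asymmetry mentioned in the question (assuming SciPy; the distributions are arbitrary). Note that `jensenshannon` returns the Jensen-Shannon distance, i.e. the square root of the divergence:

```python
import numpy as np
from scipy.stats import entropy
from scipy.spatial.distance import jensenshannon

p = np.array([0.1, 0.4, 0.5])
q = np.array([0.4, 0.4, 0.2])

print(entropy(p, q), entropy(q, p))              # KL divergence: the two directions differ
print(jensenshannon(p, q), jensenshannon(q, p))  # JS distance: symmetric in p and q
```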
14 votes · 2 answers

How do you compare two Gaussian Processes?

Kullback-Leibler divergence is one measure for comparing two probability density functions, but what metric is used to compare two GP's $X$ and $Y$?
pushkar · 169
13 votes · 4 answers

Is there any probability distance that preserves all properties of a metric?

In studying the Kullback–Leibler distance, we learn very quickly that it respects neither the triangle inequality nor symmetry, two required properties of a metric. My question is whether there is any metric of…
Jorge Leitao · 1,219
13 votes · 4 answers

Is triangle inequality fulfilled for these correlation-based distances?

For hierarchical clustering I often see the following two "metrics" (they aren't, strictly speaking) for measuring the distance between two random variables $X$ and $Y$: $\newcommand{\Cor}{\mathrm{Cor}}$ \begin{align} d_1(X,Y) &= 1-|\Cor(X,Y)|, …
Linda · 133
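For the first of the two distances, one can at least search empirically for triangle-inequality violations (a sketch of my own with simulated data; finding none is of course not a proof):

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 6))          # six random variables, 200 observations each
C = np.corrcoef(X, rowvar=False)           # 6 x 6 correlation matrix

d1 = 1 - np.abs(C)                         # d_1(X, Y) = 1 - |Cor(X, Y)|

violations = [(i, j, k) for i, j, k in combinations(range(6), 3)
              if d1[i, j] > d1[i, k] + d1[k, j] + 1e-12]
print(violations)                          # any triple listed here violates the inequality
```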
12 votes · 1 answer

Clustering inertia formula in scikit learn

I would like to code k-means clustering in Python using pandas and scikit-learn. In order to select a good k, I would like to code the Gap Statistic from Tibshirani et al. 2001 (pdf). I would like to know if I could use the inertia_ result from…
Scratch · 754
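A sketch of how `inertia_` is read off in scikit-learn (synthetic data; whether it matches the $W_k$ term of the gap statistic exactly is precisely what the question is asking):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=4, random_state=0)

# inertia_ is the within-cluster sum of squared distances to the nearest centroid
for k in range(1, 8):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    print(k, km.inertia_)
```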
11 votes · 2 answers

Metrics for covariance matrices: drawbacks and strengths

What are the "best" metrics for covariance matrices, and why? It is clear to me that Frobenius&c are not appropriate, and angle parametrizations have their issues too. Intuitively one might want a compromise between these two, but I would also like…
Quartz · 878
9 votes · 3 answers

Distance metric and curse of dimensions

Somewhere I read a note that if you have many parameters $(x_1, x_2, \ldots, x_n)$ and you try to find a "similarity metric" between these vectors, you may run into the "curse of dimensionality". I believe it meant that most similarity scores will be…
Gerenuk · 1,833
9 votes · 2 answers

Does a distance have to be a "metric" for a hierarchical clustering to be valid on it?

Let us say that we define a distance, which is not a metric, between N items. Based on this distance we then use agglomerative hierarchical clustering. Can we use each of the known algorithms (single/maximum/average linkage etc.) to get…
Tal Galili · 19,935
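Mechanically, SciPy's agglomerative clustering will accept any symmetric dissimilarity matrix without checking the metric axioms (a sketch with random dissimilarities); whether the resulting dendrogram is meaningful is the substance of the question.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

rng = np.random.default_rng(2)
D = rng.random((6, 6))                 # an arbitrary "distance" among 6 items
D = (D + D.T) / 2                      # make it symmetric
np.fill_diagonal(D, 0.0)               # zero self-distance; triangle inequality not enforced

Z = linkage(squareform(D), method="average")     # average linkage on the condensed matrix
print(fcluster(Z, t=2, criterion="maxclust"))    # cut the tree into two clusters
```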