Questions tagged [euclidean]

Euclidean distance is the intuitive notion of a 'straight-line' distance between two points in a Euclidean space.

Euclidean is an adjective derived from the name of Euclid. Euclid was a Greek mathematician who lived around 300 BC. He is known as the "founder of geometry".

Euclidean refers to many topics in geometry, e.g.

  • Euclidean space
  • Euclidean geometry
  • Non-Euclidean geometry
  • Euclidean distance
  • Euclidean ball

and in number theory, e.g.

  • Euclidean division
  • Euclidean algorithm
  • Euclid's lemma

A Euclidean distance is the straight-line distance between two points in Euclidean space. Euclidean space is the conventional 2D or 3D (or, for that matter, $n$-dimensional) space that we usually use. It is also called flat space.
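
As a rough illustration of the definition above, the distance between two points $p$ and $q$ in $n$ dimensions is $\sqrt{\sum_i (p_i - q_i)^2}$. The sketch below is only one way to compute it; the function name `euclidean_distance` is a hypothetical helper chosen for this example, not part of any library.

```python
import math


def euclidean_distance(p, q):
    """Straight-line distance between two points with the same number of coordinates.

    `euclidean_distance` is an illustrative name, not a library function.
    """
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    # Sum the squared coordinate differences, then take the square root.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


if __name__ == "__main__":
    print(euclidean_distance((0, 0), (3, 4)))        # 2D example: 5.0
    print(euclidean_distance((1, 2, 3), (4, 6, 3)))  # 3D example: 5.0
```

The same function works for any number of dimensions, as long as both points have the same length. In Python 3.8 and later, the standard library's `math.dist(p, q)` computes the same quantity.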

142 questions
90
votes
7 answers

Euclidean distance is usually not good for sparse data (and more general case)?

I have seen somewhere that classical distances (like Euclidean distance) become weakly discriminant when we have multidimensional and sparse data. Why? Do you have an example of two sparse data vectors where the Euclidean distance does not perform…
shn
  • 2,479
  • 9
  • 31
  • 38
84
votes
6 answers

Why does k-means clustering algorithm use only Euclidean distance metric?

Is there a specific purpose in terms of efficiency or functionality why the k-means algorithm does not use for example cosine (dis)similarity as a distance metric, but can only use the Euclidean norm? In general, will K-means method comply and be…
curious
  • 971
  • 1
  • 7
  • 7
35
votes
2 answers

Is cosine similarity identical to l2-normalized euclidean distance?

Identical meaning, that it will produce identical results for a similarity ranking between a vector u and a set of vectors V. I have a vector space model which has distance measure (euclidean distance, cosine similarity) and normalization technique…
28
votes
1 answer

Converting similarity matrix to (euclidean) distance matrix

In the Random forest algorithm, Breiman (author) constructs the similarity matrix as follows: send all learning examples down each tree in the forest; if two examples land in the same leaf, increment the corresponding element in the similarity matrix by 1; normalize…
Uros K
  • 467
  • 1
  • 6
  • 9
20
votes
5 answers

How I can convert distance (Euclidean) to similarity score

I am using $k$ means clustering to cluster speaker voices. When I compare an utterance with clustered speaker data I get (Euclidean distance-based) average distortion. This distance can be in range of $[0,\infty]$. I want to convert this distance to…
Muhammad
  • 331
  • 1
  • 2
  • 5
17
votes
4 answers

Definition of normalized Euclidean distance

Recently I have started looking for the definition of normalized Euclidean distance between two real vectors $u$ and $v$. So far, I have discovered two apparently unrelated…
PTDS
  • 679
  • 1
  • 4
  • 10
15
votes
1 answer

Cosine Distance as Similarity Measure in KMeans

I am currently solving a problem where I have to use Cosine distance as the similarity measure for k-means clustering. However, the standard k-means clustering package (from Sklearn package) uses Euclidean distance as standard, and does not allow…
MSalty
  • 255
  • 1
  • 2
  • 5
14
votes
3 answers

Which distance to use? e.g., manhattan, euclidean, Bray-Curtis, etc

I am not a community ecologist, but these days I am working on community ecology data. What I couldn't understand, apart from the mathematics of these distances, is the criteria for each distance to use and in what situations it can be applied. For…
user36491
  • 355
  • 2
  • 3
  • 7
11
votes
1 answer

Pros of Jeffries Matusita distance

According to some paper I am reading, Jeffries and Matusita distance is commonly used. But I couldn't find much information on it except for the formula below JMD(x,y)=$\sqrt[2]{\sum(\sqrt[2]{x_i}-\sqrt[2]{y_i})^2}$ It is similar to Euclidean…
romy_ngo
  • 213
  • 2
  • 5
10
votes
1 answer

Why is Kullback-Leibler divergence a better metric for measuring distance between two probability distributions than squared error?

I know that KL-divergence is a metric that is more suitable when we want to measure the distance between numbers which a probability form. However, I am still confused what is the benefit of using KL-divergence rather than squared error between…
Kadaj13
  • 355
  • 2
  • 8
10
votes
1 answer

My neural network can't even learn Euclidean distance

So I'm trying to teach myself neural networks (for regression applications, not classifying pictures of cats). My first experiments were training a network to implement an FIR filter and a Discrete Fourier Transform (training on "before" and…
8
votes
2 answers

Expected magnitude of a vector from a multivariate normal

What is the expected magnitude, i.e. euclidean distance from the origin, of a vector drawn from a p-dimensional spherical normal $\mathcal{N}_p(\mu,\Sigma)$ with $\mu=\vec{0}$ and $\Sigma=\sigma^2 I$, where $I$ is the identity matrix? In the…
8
votes
2 answers

K-means: Why minimizing WCSS is maximizing Distance between clusters?

From a conceptual and algorithmic standpoint, I understand how K-means works. However, from a mathematical standpoint, I don't understand why minimizing the WCSS (within-cluster sums of squares) will necessarily maximize the distance between…
slaw
  • 464
  • 1
  • 5
  • 16
8
votes
1 answer

Efficient way to compute distances between centroids from distance matrix

Let us have square symmetric matrix of squared euclidean distances $\bf D$ between $n$ points and vector lengthed $n$ indicating cluster or group membership ($k$ clusters) of the points; a cluster may consist of $\ge1$ point. What is the most…
ttnphns
  • 51,648
  • 40
  • 253
  • 462
7
votes
2 answers

Variance and asymptotic normality of $\frac{1}{n-1}\sum_{i=1}^{n-1}(x_{i+1}-x_i)^2$, where $X \sim \mathcal{N}(0,1)$

Consider a length $n$ vector $\mathbf{x}$ containing $n$ i.i.d. observations $\{x_i\}_{i=1}^n$ of a standard normal random variable $X$. Let $\mathbf{z}$ be a length $n-1$ vector whose entries are $z_i = x_{i+1}-x_i$. I have a statistic…
eyeExWhy
  • 556
  • 2
  • 9