Questions tagged [wasserstein]

The Wasserstein metric, also known as the Earth Mover's Distance, is a distance function between probability distributions.

46 questions
60
votes
5 answers

What are the advantages of the Wasserstein metric compared to Kullback-Leibler divergence?

What is the practical difference between the Wasserstein metric and the Kullback-Leibler divergence? The Wasserstein metric is also referred to as the Earth mover's distance. From Wikipedia: the Wasserstein (or Vaserstein) metric is a distance function defined between…
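A minimal sketch of the contrast, assuming SciPy (`scipy.stats.wasserstein_distance` and `scipy.stats.entropy`); the shared binning and the smoothing constant are illustrative choices, not part of the original question:

```python
import numpy as np
from scipy.stats import wasserstein_distance, entropy

rng = np.random.default_rng(0)
p_samples = rng.normal(0.0, 1.0, 10_000)
q_samples = rng.normal(0.5, 1.0, 10_000)

# The Wasserstein distance works directly on empirical samples.
w1 = wasserstein_distance(p_samples, q_samples)

# KL needs density estimates; shared bins keep the supports aligned,
# and a small epsilon keeps the log-ratio defined in empty bins.
bins = np.linspace(-5, 5, 101)
p_hist, _ = np.histogram(p_samples, bins=bins)
q_hist, _ = np.histogram(q_samples, bins=bins)
eps = 1e-12
kl = entropy(p_hist + eps, q_hist + eps)

print(f"W1 = {w1:.3f}, KL = {kl:.3f}")
```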
7
votes
1 answer

Multivariate Wasserstein metric for $n$ dimensions

I am a vegetation ecologist and poor student of computer science who recently learned of the Wasserstein metric. I find the application of this metric to 1-D distributions fairly intuitive, and inspection of the wasserstein1d function from transport…
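A minimal sketch for point clouds in $\mathbb{R}^n$, assuming the POT library (`pip install pot`); `ot.emd2` solves the exact discrete transport problem for a given ground-cost matrix:

```python
import numpy as np
import ot  # Python Optimal Transport

rng = np.random.default_rng(0)
xs = rng.normal(size=(100, 3))           # 100 source points in R^3
xt = rng.normal(loc=1.0, size=(120, 3))  # 120 target points in R^3

a = np.full(100, 1 / 100)  # uniform weights on the source points
b = np.full(120, 1 / 120)  # uniform weights on the target points

M = ot.dist(xs, xt, metric="euclidean")  # pairwise ground costs
print(ot.emd2(a, b, M))                  # Wasserstein-1 cost
```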
7
votes
2 answers

Relation Between Wasserstein Distance and KL-Divergence (Relative Entropy)

Consider the Wasserstein metric of order one $W_1$ (a.k.a. the Earth Mover's Distance). I would like to know whether it is possible to link $W_1$ and the Kullback–Leibler divergence (a.k.a. relative entropy) and what this would mean intuitively. I can't…
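One direction of the link is available on bounded spaces: combining $W_1 \le D\,\delta$ (where $D$ is the diameter of the space and $\delta$ the total variation distance) with Pinsker's inequality gives

$$W_1(\mu, \nu) \;\le\; D\,\delta(\mu, \nu) \;\le\; D\sqrt{\tfrac{1}{2}\,\mathrm{KL}(\mu \,\|\, \nu)}.$$

No bound in the reverse direction holds in general, since the KL divergence can be infinite while $W_1$ is arbitrarily small.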
7
votes
2 answers

Calculate Earth Mover's Distance for two grayscale images

I am trying to calculate the EMD (a.k.a. Wasserstein distance) for these two grayscale (299x299) images/heatmaps. Right now, I am calculating the histogram/distribution of both images. The histograms will each be a vector of size 256 in which the nth value…
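A minimal sketch under the question's own setup, assuming SciPy: each image is reduced to a 256-bin intensity histogram, and the histograms are compared with the 1-D Wasserstein distance using the gray level as the ground metric (`histogram_emd` is an illustrative helper name):

```python
import numpy as np
from scipy.stats import wasserstein_distance

def histogram_emd(img_a: np.ndarray, img_b: np.ndarray) -> float:
    """EMD between the intensity distributions of two uint8 images."""
    hist_a, _ = np.histogram(img_a, bins=256, range=(0, 256))
    hist_b, _ = np.histogram(img_b, bins=256, range=(0, 256))
    values = np.arange(256)  # bin centers = gray levels 0..255
    return wasserstein_distance(values, values,
                                u_weights=hist_a, v_weights=hist_b)

# Random stand-ins for the two 299x299 heatmaps.
rng = np.random.default_rng(0)
a = rng.integers(0, 256, size=(299, 299), dtype=np.uint8)
b = rng.integers(0, 256, size=(299, 299), dtype=np.uint8)
print(histogram_emd(a, b))
```

Note that this compares intensity distributions only; an EMD over pixel positions, which respects spatial structure, is a different and computationally harder problem.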
6
votes
1 answer

Earth Mover's Distance and Maximum Mean Discrepancy

By Kantorovich–Rubinstein duality, the Earth Mover's Distance (EMD)/Wasserstein metric is equivalent to the Maximum Mean Discrepancy (MMD), correct? See here for a more thorough explanation. Why then does the original Kernel MMD paper compare their…
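For contrast with the EMD examples above, a minimal sketch of the biased RBF-kernel estimator of $\mathrm{MMD}^2$; the function name and the bandwidth choice are illustrative assumptions:

```python
import numpy as np

def mmd2_rbf(x: np.ndarray, y: np.ndarray, sigma: float = 1.0) -> float:
    """Biased estimator of squared MMD with a Gaussian kernel."""
    def k(a, b):
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

rng = np.random.default_rng(0)
x = rng.normal(size=(200, 2))
y = rng.normal(loc=0.5, size=(200, 2))
print(mmd2_rbf(x, y))
```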
5
votes
2 answers

Wasserstein Loss is very sensitive to model architecture

I am working on a class project where I compare the performance of GAN and WGAN. Since the only difference between GAN and WGAN is the Wasserstein loss, I chose one neural network model architecture and trained both GAN and WGAN (so, only the loss…
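A minimal sketch of how the Wasserstein loss is often written in Keras, with labels in $\{-1, +1\}$ so that the mean of label times critic score estimates the dual objective; the sign convention here is one common choice, not the only one:

```python
import tensorflow as tf

def wasserstein_loss(y_true, y_pred):
    # y_true is -1 for one class and +1 for the other; the critic
    # is trained to widen the gap between the two mean scores.
    return tf.reduce_mean(y_true * y_pred)

# critic.compile(optimizer=..., loss=wasserstein_loss)
```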
4
votes
1 answer

What is the intuitive difference between Wasserstein-1 distance and Wasserstein-2 distance?

What is the intuitive difference between Wasserstein-1 distance and Wasserstein-2 distance, and how to know which one to use?
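A minimal numerical sketch, assuming equal-size 1-D samples, for which $W_p$ reduces to a monotone matching of order statistics (`wasserstein_p` is an illustrative helper name):

```python
import numpy as np

def wasserstein_p(x: np.ndarray, y: np.ndarray, p: float) -> float:
    """W_p between two equal-size 1-D empirical samples."""
    x, y = np.sort(x), np.sort(y)  # the optimal 1-D coupling is monotone
    return np.mean(np.abs(x - y) ** p) ** (1 / p)

rng = np.random.default_rng(0)
x = rng.normal(0, 1, 1000)
y = rng.normal(0, 2, 1000)
print(wasserstein_p(x, y, 1), wasserstein_p(x, y, 2))
```

Because of the power $p$ inside the coupling cost, $W_2$ weights large transport displacements more heavily than $W_1$, which is one way to read the intuitive difference.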
4
votes
1 answer

Upper Bound on the Wasserstein Distance

I'm interested to know if it's possible to construct an upper bound on the Wasserstein distance in terms of the Kolmogorov distance. The Wasserstein distance can be written as $$W_{1}\left(F, G\right)=\int_{-\infty}^{\infty}\left|F(x)-G(x)\right| d…
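If both distributions are supported on an interval of length $L$, the integrand in that representation is pointwise bounded by the Kolmogorov distance $K(F, G) = \sup_x |F(x) - G(x)|$ and vanishes outside the interval, giving

$$W_1(F, G) = \int_{-\infty}^{\infty} \left|F(x) - G(x)\right| dx \;\le\; L \, K(F, G).$$

Without a bounded-support (or moment) assumption, no such bound can hold, since the tail contribution to the integral can be made arbitrarily large at a fixed value of $K$.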
4
votes
0 answers

Wasserstein distance between a Gaussian and the empirical distribution

The Wasserstein distance between two Gaussians has a well-known closed-form solution. Does the same hold for the distance between a Gaussian with fixed variance (say 1) and the empirical data distribution? The empirical data distribution is defined as: $$…
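A minimal sketch, assuming SciPy: the univariate closed form for two Gaussians, next to a quantile-grid approximation of $W_2$ between $\mathcal{N}(\mu, 1)$ and an empirical sample (the helper names and the midpoint grid are illustrative choices):

```python
import numpy as np
from scipy.stats import norm

def w2_gaussians(m1, s1, m2, s2):
    """Closed-form W_2 between univariate Gaussians."""
    return np.sqrt((m1 - m2) ** 2 + (s1 - s2) ** 2)

def w2_gaussian_vs_empirical(mu, sample):
    """Approximate W_2 via the 1-D quantile formula on a midpoint grid."""
    x = np.sort(sample)
    t = (np.arange(len(x)) + 0.5) / len(x)
    return np.sqrt(np.mean((x - norm.ppf(t, loc=mu, scale=1.0)) ** 2))

rng = np.random.default_rng(0)
sample = rng.normal(0.3, 1.0, 5000)
print(w2_gaussians(0, 1, 0.3, 1), w2_gaussian_vs_empirical(0.0, sample))
```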
4
votes
2 answers

Kullback-Leibler distance for comparing two distributions from sample points

I have two data samples of a value and I want to compute some distance that would represent the difference in their distributions. I read about the Kullback-Leibler distance, which could be used for comparing two distributions. Would it be the right way…
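A minimal sketch of a histogram-based KL estimate from two samples, assuming SciPy; the binning and smoothing constant are illustrative choices, and the printed pair shows that KL is not symmetric:

```python
import numpy as np
from scipy.stats import entropy

def kl_from_samples(x, y, bins=50):
    lo, hi = min(x.min(), y.min()), max(x.max(), y.max())
    p, _ = np.histogram(x, bins=bins, range=(lo, hi))
    q, _ = np.histogram(y, bins=bins, range=(lo, hi))
    eps = 1e-12  # smooth empty bins so the log-ratio is defined
    return entropy(p + eps, q + eps)

rng = np.random.default_rng(0)
x, y = rng.normal(0, 1, 5000), rng.normal(1, 1, 5000)
print(kl_from_samples(x, y), kl_from_samples(y, x))  # not equal
```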
3
votes
1 answer

Why is "weight clipping" needed for Wasserstein GANs?

I am reading the original paper on the Wasserstein GAN: https://arxiv.org/pdf/1701.07875.pdf and I came across this paragraph: I don't understand the statement: "$\mathcal{W}$ is compact implies that all the functions $f_w$ will be $K$-Lipschitz…
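A minimal PyTorch sketch of the clipping step the paper describes: after each critic update, every parameter is clamped into $[-c, c]$ so that the parameter set $\mathcal{W}$ stays compact ($c = 0.01$ is the value used in the paper's experiments; the helper name is an illustrative choice):

```python
import torch

def clip_critic_weights(critic: torch.nn.Module, c: float = 0.01) -> None:
    # Clamp every critic parameter into the compact box [-c, c].
    with torch.no_grad():
        for p in critic.parameters():
            p.clamp_(-c, c)
```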
3
votes
1 answer

Computing Wasserstein Distance

For two probability measures $\mu$ and $\nu$, the Wasserstein distance is defined as $$W_p (\mu , \nu) = \left[ \inf\limits_{\gamma \in \Gamma} \int |x-y|^p \, d\gamma (x,y) \right] ^{\frac{1}{p}} \, , $$ where $\Gamma$ is the set of all measures $\gamma…
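For small discrete instances the infimum is a linear program over couplings with fixed marginals, which SciPy's `linprog` can solve directly; a minimal sketch (`wasserstein_lp` is an illustrative helper name):

```python
import numpy as np
from scipy.optimize import linprog

def wasserstein_lp(mu, nu, x, y, p=1):
    """W_p between discrete measures mu on points x and nu on points y."""
    n, m = len(mu), len(nu)
    C = np.abs(x[:, None] - y[None, :]) ** p  # ground cost |x - y|^p
    # Equality constraints: rows of gamma sum to mu, columns to nu.
    A_eq = np.zeros((n + m, n * m))
    for i in range(n):
        A_eq[i, i * m:(i + 1) * m] = 1
    for j in range(m):
        A_eq[n + j, j::m] = 1
    res = linprog(C.ravel(), A_eq=A_eq, b_eq=np.concatenate([mu, nu]),
                  bounds=(0, None))
    return res.fun ** (1 / p)

mu, x = np.array([0.5, 0.5]), np.array([0.0, 1.0])
nu, y = np.array([1.0]), np.array([0.5])
print(wasserstein_lp(mu, nu, x, y))  # 0.5
```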
3
votes
0 answers

Distance or divergence for ordinal distribution

Measures like KL divergence can be symmetrized (into JS divergence). Bhattacharyya distance serves a similar function. Either is well-suited to both continuous distributions and discrete (e.g. multinomial) distributions. Is there a measure that…
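For what it's worth, the Wasserstein distance itself fits this setting: coding the ordered categories as integers gives a ground metric that respects the ordering, as in this minimal SciPy sketch with made-up Likert-style histograms:

```python
import numpy as np
from scipy.stats import wasserstein_distance

levels = np.arange(5)  # ordered categories coded 0..4
p = np.array([0.1, 0.2, 0.4, 0.2, 0.1])
q = np.array([0.4, 0.3, 0.2, 0.1, 0.0])
print(wasserstein_distance(levels, levels, u_weights=p, v_weights=q))
```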
3
votes
0 answers

Prove the existence of a fixed point of a certain mapping of distributions

Let $\tilde{X}_0$ be some random variable on $\mathbb{R}^n$, with a strictly positive p.d.f. Define: $$X_0:=(\operatorname{var}{\tilde{X}_0})^{-\frac{1}{2}}(\tilde{X}_0-\mathbb{E}\tilde{X}_0),$$ where we take the unique positive definite matrix…
3
votes
0 answers

In a WGAN, when is the generator's loss function ever used?

I've been building a Wasserstein GAN in Keras recently following the original Arjovsky implementation in PyTorch and ran across an issue I've yet to understand. To my knowledge, the critic network is first trained on a real batch of data, then…
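A minimal PyTorch sketch of where the generator loss enters in that training scheme: only in the generator's own update, after the critic's updates for the batch are done (all names here are illustrative, not the Arjovsky code):

```python
import torch

def generator_step(generator, critic, g_opt, batch_size, z_dim, device):
    z = torch.randn(batch_size, z_dim, device=device)
    g_loss = -critic(generator(z)).mean()  # generator raises critic score
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()
    return g_loss.item()
```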