
I am currently reading about GANs and I had a question about latent space.

A site mentions:

Latent space refers to an abstract multi-dimensional space containing feature values that we cannot interpret directly, but which encodes a meaningful internal representation of externally observed events.

Furthermore, this answer mentions

The latent space itself has no meaning. Typically it is a 100-dimensional hypersphere with each variable drawn from a Gaussian distribution

From my understanding, certain points from the latent space are inputted through the generator model during training.

If the latent space is just a Gaussian distribution, how does it encode "meaningful internal representation of externally observed events"?

ianc1339
  • The use of a Gaussian distribution for a GAN's latent space is somewhat arbitrary (we could have used other distributions), since the model transforms the distribution at each layer. We could just as well feed it a uniform distribution, which the GAN would internally transform into a Gaussian if it needed to. The Gaussian can be justified by its mathematical simplicity and the fact that it works well in practice. It might also have been "inherited" from the practice on latent spaces of variational autoencoders. – Ilyas Moutawwakil Jan 29 '22 at 23:25

1 Answer


We say the latent space is a "meaningful internal representation" because manipulations and transformations of a latent vector $z$ result in meaningful changes in the observed output $x = f(z)$.

"Meaningful changes" is a pretty vague term, but it's commonly used to say something like: for $z_1, z_2$ which map to $x_1, x_2$ if we define $x_3 = f(0.5 z_1 + 0.5 z_2)$, it looks like a mixture of $x_1$ and $x_2$. For semantic attributes $a$ such as "hair color" or "brightness" or "face orientation", $a(x_3)$ is somewhere between $a(x_1)$ and $a(x_2)$.

None of this so far makes any assumption about the construction of $f$, or that there is any distribution on the latent space.

Now suppose you choose some distribution over the latent space $p(z)$, which induces a corresponding $p(x)$. The goal of training a GAN (or any latent variable generative model) is to make $p(x)$ match up with the empirical data as best as possible.

So to answer your question:

  • The "meaningfulness" of a latent space is a property of the mapping $f$, not the distribution over $z$. $^1$
  • Separately, the Gaussian distribution $p(z)$ is chosen mostly for empirical reasons, but this choice is kind of independent of the latent space -- I could define a uniform distribution $p'(z)$, sample from the resulting $p'(x)$, and even train a GAN using $p'$.
  • So how does it happen that the latent space / mapping is meaningful? This is kind of just an empirical fact about the world, that if you train a GAN, you will get such nice properties.
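The second bullet can be sketched directly: the same mapping $f$ can be paired with different latent distributions, each inducing a different distribution over outputs. Again `f` is a hypothetical toy generator, not a trained model:

```python
import numpy as np

rng = np.random.default_rng(1)

# Same hypothetical toy generator as before.
W = rng.standard_normal((100, 784))

def f(z):
    return np.tanh(z @ W)

# Two different latent distributions over the same latent space:
z_gauss = rng.standard_normal((8, 100))      # p(z)  = N(0, I)
z_unif = rng.uniform(-1.0, 1.0, (8, 100))    # p'(z) = Uniform(-1, 1)

# Pushing each through f induces a different output distribution.
x_from_gauss = f(z_gauss)  # samples from the induced p(x)
x_from_unif = f(z_unif)    # samples from the induced p'(x)
```

Training the GAN then amounts to adjusting the parameters of $f$ so that whichever induced distribution you chose matches the empirical data.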

$^1$ This isn't entirely true, I think in practice you expect the nice meaningful properties of a latent space only to hold up within a certain region near the origin, for most common values of $f$.

shimao