I know this question is a bit weird. How can you ensure that the outputs of the encoder (the latent space) really are the mean and log sigma? Is it because they are user-defined quantities? And why is the output ordered as mean then log standard deviation, rather than log standard deviation then mean?

flashing sweep

1 Answer

The latent distribution is pushed toward the desired prior by the Kullback-Leibler divergence (KLD) penalty. The more the model deviates from the desired prior, the larger the penalty. The latent distribution matches the prior exactly when the KLD penalty is 0, but this ideal is rarely practical to achieve, so instead we accept a certain amount of non-conformity to the prior in order to reduce the total model loss (reconstruction error plus KLD) to a smaller quantity.
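To make the penalty concrete, here is a minimal sketch that assumes a diagonal Gaussian posterior and a standard normal prior, which is the common VAE setup but is not stated in the question. The function name `kld_to_standard_normal` is made up for illustration.

```python
import math

def kld_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dimensions.

    Assumes a diagonal Gaussian posterior and a standard normal prior.
    """
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))

# The penalty vanishes only when the posterior equals the prior (mu = 0, log_var = 0).
penalty_at_prior = kld_to_standard_normal([0.0, 0.0], [0.0, 0.0])

# Any deviation in the mean or the variance makes the penalty positive.
penalty_off_prior = kld_to_standard_normal([1.0, -0.5], [0.3, -0.2])

print(penalty_at_prior, penalty_off_prior)
```

In training, this term is added to the reconstruction loss, which is why the optimum usually sits at a small positive KLD rather than at exactly 0.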

The roles are user-defined in the sense that your code decides which output units are passed to the loss as the mean and which as the log-variance. When the network is first initialized, it has no feedback about which units play which role. But after the first batch is back-propagated and gradient updates are applied, the units used as the mean receive the gradient that the KLD penalty assigns to the mean, so they adjust accordingly; likewise, the parameters for the log-variance are adjusted by the gradient assigned to the log-variance.
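The following sketch shows why the two groups of units end up with different meanings. It assumes the KLD against a standard normal prior and a hypothetical convention in which the first half of the encoder output is read as the mean and the second half as the log-variance. The helper names and the example vector `h` are illustrative only.

```python
import math

def split_encoder_output(h):
    """Hypothetical convention: first half is the mean, second half the log-variance."""
    d = len(h) // 2
    return h[:d], h[d:]

def kld_gradients(mu, log_var):
    """Gradients of 0.5 * sum(exp(lv) + mu^2 - 1 - lv) w.r.t. mu and log_var."""
    grad_mu = list(mu)                                           # d/d mu = mu
    grad_log_var = [0.5 * (math.exp(lv) - 1.0) for lv in log_var]  # d/d lv = 0.5 * (exp(lv) - 1)
    return grad_mu, grad_log_var

h = [0.8, -1.2, 0.5, -0.3]           # made-up raw encoder output for one example
mu, log_var = split_encoder_output(h)
g_mu, g_lv = kld_gradients(mu, log_var)

print("gradient w.r.t. mean units:        ", g_mu)
print("gradient w.r.t. log-variance units:", g_lv)
```

Swapping the two halves in `split_encoder_output` would simply swap which units learn which role. The network has no preference; the order is whatever the loss and the sampling step read it as.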

Sycorax