
For generative adversarial networks (GANs), Goodfellow originally used a min-max formulation,
$$\min_G \max_D \; \mathbb{E}_{x\sim p_{\text{data}}}[\log D(x)] + \mathbb{E}_{z\sim p_z}[\log(1-D(G(z)))].$$
As long as the generator $G$ is fixed, the optimal discriminator $D$ has an explicit form. My question is: since we have a clear understanding of $D$, why not just minimize the Jensen-Shannon divergence directly as a pure minimization problem, which is equivalent to the min-max formulation?
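For concreteness, the explicit form I am referring to is the one derived in Goodfellow et al. (2014) (the notation $p_g$ for the generator's distribution follows that paper): for a fixed $G$,
$$D^*_G(x) = \frac{p_{\text{data}}(x)}{p_{\text{data}}(x) + p_g(x)},$$
and plugging $D^*_G$ back into the objective gives
$$C(G) = -\log 4 + 2\,\mathrm{JSD}\big(p_{\text{data}} \,\|\, p_g\big),$$
so at the optimal discriminator, minimizing over $G$ is the same as minimizing the Jensen-Shannon divergence.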

By the way, other researchers proposed a related framework called f-GAN, which replaces the Jensen-Shannon divergence with other $f$-divergences. They also adopted the min-max formulation rather than minimizing the divergence directly. Why is the min-max formulation more popular than direct minimization in these works?
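For context, my understanding of the f-GAN construction (Nowozin et al., 2016) is that it rests on the variational (Fenchel-conjugate) lower bound of an $f$-divergence,
$$D_f(P \,\|\, Q) \;\ge\; \sup_{T}\; \mathbb{E}_{x\sim P}[T(x)] - \mathbb{E}_{x\sim Q}[f^*(T(x))],$$
where $T$ plays the role of the discriminator and $f^*$ is the convex conjugate of $f$; the supremum over $T$ is what turns the divergence estimate into the inner maximization of a min-max problem.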

YUAN Zhiri
  • The Wasserstein GAN (especially this post https://lilianweng.github.io/lil-log/2017/08/20/from-GAN-to-WGAN.html by Lilian Weng) helped me a lot to understand – Firebug Apr 02 '21 at 17:05

1 Answer


Why not directly minimize the Jensen-Shannon divergence between the generator's distribution and the empirical distribution? Because it is intractable to compute: the generator's marginal density $p(x) = \int p(x\mid z)\,p(z)\,dz$ is very hard to work with computationally, whereas the min-max objective only requires samples from the two distributions.
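As a minimal sketch of why samples suffice (a toy example of my own, assuming PyTorch and 1-D Gaussian data, not code from any of the papers above): every expectation in the min-max objective is estimated from minibatches and the discriminator's output, so the marginal density $p(x)$ is never evaluated.

```python
import torch
import torch.nn as nn

# Toy 1-D GAN: real data ~ N(2, 0.5), generator maps 8-D noise to a scalar.
G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())
opt_G = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_D = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()

for step in range(1000):
    real = 2.0 + 0.5 * torch.randn(64, 1)   # minibatch of samples from p_data
    z = torch.randn(64, 8)                  # minibatch of samples from p(z)
    fake = G(z)

    # Inner max over D: a Monte-Carlo estimate of
    # E_real[log D(x)] + E_fake[log(1 - D(G(z)))] -- only samples are needed.
    d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    opt_D.zero_grad()
    d_loss.backward()
    opt_D.step()

    # Outer min over G (non-saturating variant: push D(G(z)) toward 1).
    g_loss = bce(D(G(z)), torch.ones(64, 1))
    opt_G.zero_grad()
    g_loss.backward()
    opt_G.step()
```

By contrast, evaluating $\mathrm{JSD}(p_{\text{data}} \,\|\, p_g)$ directly would require the density $p_g(x) = \int p(x\mid z)\,p(z)\,dz$, which has no closed form once $G$ is a neural network.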

shimao
  • Not contradicting what you said, but a connection can be established: [the GAN loss function is exactly the JS divergence when the discriminator is optimal](https://lilianweng.github.io/lil-log/2017/08/20/from-GAN-to-WGAN.html#what-does-the-loss-function-represent) – Firebug Apr 02 '21 at 17:10
  • Yeah you are right. I just figured this out days ago while trying to implement the algorithm. – YUAN Zhiri Apr 05 '21 at 14:00