
I have just read the paper https://arxiv.org/pdf/1701.00160.pdf, which is a tutorial on GANs. I have a couple of questions:

1. Must the dimension of the Generator's output layer match that of the Discriminator's input layer, so that the composition $D(G(z))$ in equation (1) is well defined? $$-\frac{1}{2} \mathbb{E}_{x\sim p_{data}(x)}[\log D(x)] -\frac{1}{2}\mathbb{E}_{z\sim p_z(z)}[\log (1 - D(G(z)))] \tag{1}$$

2. The authors say that equation (1) is simply a cross-entropy cost function, but I struggle to see how it compares with equation (2): $$C = -\frac{1}{n} \sum_x [y \ln a + (1-y)\ln(1-a)] \tag{2}$$ Specifically, what does the $\frac{1}{2}$ in equation (1) represent? Is equation (1) missing the truth values, i.e. the $y$ shown in equation (2), or is $y$ equal to $\frac{1}{2}$ in equation (1)? Any clarification will be highly appreciated.


1 Answer

1. Yes: the Discriminator must accept both the Generator's outputs (fake samples) and real samples, so its input dimension must equal the Generator's output dimension, which is the dimension of the data.

2. The truth value is

$$y = \begin{cases}1, & \text{for input } x \quad\text{(real)}\\ 0, & \text{for input } G(z) \quad\text{(fake)}\end{cases}$$

This is even mentioned just below that equation:

This is just the standard cross-entropy cost that is minimized when training a standard binary classifier with a sigmoid output. The only difference is that the classifier is trained on two minibatches of data; one coming from the dataset, where the label is 1 for all examples, and one coming from the generator, where the label is 0 for all examples.

You'll see this matches the cross-entropy (Bernoulli negative log-likelihood) description: you use the probability of a real sample being real, $D(x)$, and the probability of a fake sample being fake, $1-D(G(z))$. The worked substitution below makes the correspondence, and the role of the $\frac{1}{2}$, explicit.
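To spell out the substitution in the notation of equation (2) (with the minibatch average written as an expectation): set $y=1$, $a=D(x)$ on the real minibatch and $y=0$, $a=D(G(z))$ on the fake minibatch, then mix the two minibatches with equal weight:

$$-\frac{1}{2}\,\mathbb{E}_{x\sim p_{data}(x)}\big[1\cdot\log D(x)+(1-1)\log(1-D(x))\big] -\frac{1}{2}\,\mathbb{E}_{z\sim p_z(z)}\big[0\cdot\log D(G(z))+(1-0)\log(1-D(G(z)))\big]$$

The zero terms drop out and equation (1) remains. So the $\frac{1}{2}$ is not a truth value: it is the mixing weight of the two equally sized minibatches (half of each training batch is real, half is fake), and $y$ does not appear explicitly because it is constant (1 or 0) inside each expectation.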

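For concreteness, here is a minimal numpy sketch of equation (1) used as a discriminator loss. It assumes the discriminator outputs are already sigmoid probabilities in $(0, 1)$; the function and variable names are illustrative, not taken from the paper.

```python
import numpy as np

def discriminator_loss(d_real, d_fake):
    """Equation (1): equally weighted cross-entropy over a real and a fake minibatch.

    d_real: D(x) evaluated on real samples, values in (0, 1)
    d_fake: D(G(z)) evaluated on generated samples, values in (0, 1)
    """
    # Real minibatch, y = 1: each sample contributes -log D(x)
    real_term = -np.mean(np.log(d_real))
    # Fake minibatch, y = 0: each sample contributes -log(1 - D(G(z)))
    fake_term = -np.mean(np.log(1.0 - d_fake))
    # The 1/2 factors weight the two minibatches equally
    return 0.5 * real_term + 0.5 * fake_term

# Toy usage: a well-trained discriminator scores real near 1, fake near 0,
# which makes both terms (and hence the loss) small.
d_real = np.array([0.9, 0.8, 0.95])
d_fake = np.array([0.1, 0.2, 0.05])
print(discriminator_loss(d_real, d_fake))
```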