9

I've come across a lemma in the infoGAN paper. I do not understand the derivation of Lemma 5.1 in the addendum of the paper. It goes as follows (included as png):

Lemma 5.1

I do not understand the last step. Why can one pull $f(x,y)$ into the innermost integral, transforming it into $f(x',y)$? And what are the suitable regularity conditions on $f$?

spurra
  • I looked at the paper and I don't think the proof you wrote above is exactly the same as the one in the paper. It looked to me that f(x,y) was pulled out of the innermost integral because it doesn't depend on x'. – Michael R. Chernick Dec 11 '16 at 00:39
  • The png is a screenshot from the paper :) – spurra Dec 11 '16 at 11:07

4 Answers

5

Consider the difference $$ D = \int_x \int_y P(x,y) \int_{x'} P(x'|y) \left[ f(x,y) - f(x',y) \right] \, dx' dy dx $$ obtained by moving $f(x,y)$ into the $x'$ integral and subtracting the same expression with $x$ replaced by $x'$. Conditioning $x$ on $y$, i.e. writing $P(x,y) = P(y) P(x|y)$, gives $$ D = \int_y P(y) \int_x \int_{x'} P(x|y) P(x'|y) \left[ f(x,y) - f(x',y) \right] \, dx' dx dy. $$ The interior object $$ \delta = \int_x \int_{x'} P(x|y) P(x'|y) \left[ f(x,y) - f(x',y) \right] \, dx' dx $$ is antisymmetric under swapping the dummy variables $x$ and $x'$: the weight $P(x|y) P(x'|y)$ is unchanged while the bracket changes sign, so $\delta = -\delta$ and hence $\delta = 0$ for every $y$. Therefore $D = 0$. I suspect that the regularity conditions are simply those that prevent these integrals from diverging.
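The antisymmetry argument can be sanity-checked numerically on a small discrete example, with the integrals replaced by sums. This is only a sketch: the joint table `P`, the function table `f`, the sizes and the helper names are all made up for illustration and do not come from the paper.

```python
import random

def random_joint(nx, ny, rng):
    """Return a random joint pmf P[x][y] over nx * ny outcomes."""
    w = [[rng.random() for _ in range(ny)] for _ in range(nx)]
    total = sum(map(sum, w))
    return [[v / total for v in row] for row in w]

def inner_delta(P, f, y):
    """delta(y) = sum_{x, x'} P(x|y) P(x'|y) [f(x, y) - f(x', y)]."""
    nx = len(P)
    p_y = sum(P[x][y] for x in range(nx))
    cond = [P[x][y] / p_y for x in range(nx)]  # P(x|y) for this y
    return sum(cond[x] * cond[xp] * (f[x][y] - f[xp][y])
               for x in range(nx) for xp in range(nx))

rng = random.Random(0)
NX, NY = 4, 3                                   # arbitrary illustrative sizes
P = random_joint(NX, NY, rng)
f = [[rng.uniform(-5, 5) for _ in range(NY)] for _ in range(NX)]

# Each delta(y) should vanish up to floating-point rounding.
deltas = [inner_delta(P, f, y) for y in range(NY)]
print(deltas)
```

Every term with $(x, x')$ is cancelled by the term with $(x', x)$, which is why each entry of `deltas` is zero up to rounding regardless of how `f` is chosen.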

jwimberley
  • I haven't had the time yet to go over your answer. I've awarded the bounty to you in good faith as it's about to end in 10 mins, and I'll get back to you with any clarification questions I have. – spurra Dec 20 '16 at 10:07
  • Is this a well-known trick? Without your explanation I think it's quite hard to follow the proof in the paper. – Attila Kun Sep 18 '18 at 22:04
  • @kahoon, William's answer below is pretty much identical to mine but much more straightforward. In fact, I worried about regularity conditions, but I think that other answer shows that these are immaterial. I'd say both tricks are well-known, but the simple swap-and-commute relabelling William shows is probably the way readers were intended to follow along; I think it would have been clearer if they'd added the extra line that William shows. – jwimberley Sep 19 '18 at 01:13
  • @jwimberley Thanks! The "swap x and x'" part of William's answer confused me for a moment but I guess that's legal to do as we're just relabelling the dummy variables right? – Attila Kun Sep 19 '18 at 21:21
  • @kahoon Exactly – jwimberley Sep 19 '18 at 23:01
3

Or, after the third row \begin{align} &=\int_x\int_yp(x|y)p(y)f(x,y)\int_{x'}p(x'|y)dx'dydx\\ &=\int_x\int_yp(x|y)f(x,y)\int_{x'}p(x',y)dx'dydx. \end{align}

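Swap the dummy variables $x$ and $x'$ (a pure relabelling), then exchange the order of integration. This gives the expression with $f(x',y)$ inside the innermost integral. Done.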

William
0

I think it is more intuitive to derive the equation in reverse, starting from the right-hand side:

\begin{align*} E_{x \sim X, y \sim Y|x, x' \sim X|y} \left[ f(x', y) \right] & = \int_x p(x) \int_y p(y|x) \int_{x'} p(x'|y) f(x', y) dx'dydx \\ & = \int_y p(y) \int_x p(x|y) \int_{x'} p(x'|y) f(x', y) dx'dxdy \\ & = \int_y p(y) \int_{x'} p(x'|y) f(x', y) \underbrace{\int_x p(x|y) dx}_{=1}dx'dy \\ & = \int_y p(y) \int_{x} p(x|y) f(x, y) dxdy \\ & = \int_x p(x) \int_{y} p(y|x) f(x, y) dydx \\ & = E_{x \sim X, y \sim Y|x} \left[ f(x, y) \right] \end{align*}
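The chain of equalities above can be checked exactly on a finite example, where each integral becomes a sum. The following is a minimal sketch under that assumption; the table sizes, the random joint `P` and the function table `f` are illustrative choices, not anything taken from the paper.

```python
import random

rng = random.Random(1)
NX, NY = 3, 4                                    # arbitrary illustrative sizes

# Random joint pmf p(x, y), stored as P[x][y].
w = [[rng.random() for _ in range(NY)] for _ in range(NX)]
total = sum(map(sum, w))
P = [[v / total for v in row] for row in w]

# Arbitrary test function f(x, y), stored as f[x][y].
f = [[rng.uniform(-1, 1) for _ in range(NY)] for _ in range(NX)]

p_x = [sum(P[x]) for x in range(NX)]                         # marginal p(x)
p_y = [sum(P[x][y] for x in range(NX)) for y in range(NY)]   # marginal p(y)

def p_y_given_x(y, x):
    return P[x][y] / p_x[x]

def p_x_given_y(x, y):
    return P[x][y] / p_y[y]

# E_{x ~ X, y ~ Y|x}[f(x, y)]
lhs = sum(P[x][y] * f[x][y] for x in range(NX) for y in range(NY))

# E_{x ~ X, y ~ Y|x, x' ~ X|y}[f(x', y)], first line of the derivation
rhs = sum(p_x[x] * p_y_given_x(y, x) * p_x_given_y(xp, y) * f[xp][y]
          for x in range(NX) for y in range(NY) for xp in range(NX))

print(lhs, rhs)
```

The two printed values agree up to floating-point rounding, because summing `p_x[x] * p_y_given_x(y, x)` over `x` leaves `p_y[y]`, exactly as in the step where the inner integral equals 1.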

Shun
0

The assertion $$ E_{x \sim X, y \sim Y|x} \left[ f(x, y) \right] = E_{x \sim X, y \sim Y|x, x' \sim X|y} \left[ f(x', y) \right]\tag1 $$ is really saying:

If the random vector $(X,Y,X')$ has joint distribution $$P_{X,Y,X'}(x,y,z)=P_X(x)P_{Y|X}(y|x)P_{X|Y}(z|y),\tag2$$ then $E[f(X,Y)] = E[f(X',Y)]$.

The result follows from the fact that $(X,Y)$ has the same distribution as $(X',Y)$, which is seen from: $$P_{X'|Y}(z|y)=\int_x\frac{ P_{X,Y,X'}(x,y,z)}{P_Y(y)}\,dx\stackrel{(2)}= \int_x P_{X|Y}(x|y)P_{X|Y}(z|y)\,dx=P_{X|Y}(z|y).$$ Not much regularity is required here besides the existence of the expectation $E[f(X,Y)]$.
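As a rough check of the distributional claim, one can build the three-way joint (2) for a small discrete example and marginalise out $x$; the result should reproduce the original joint of $(X,Y)$. This is a sketch with made-up tables and names (`P`, `joint3`, `Q`), assuming finite supports so that the integral is a sum.

```python
import random

rng = random.Random(2)
NX, NY = 3, 3                                    # arbitrary illustrative sizes

# Random joint pmf of (X, Y), stored as P[x][y].
w = [[rng.random() for _ in range(NY)] for _ in range(NX)]
total = sum(map(sum, w))
P = [[v / total for v in row] for row in w]

p_y = [sum(P[x][y] for x in range(NX)) for y in range(NY)]   # P_Y(y)

def joint3(x, y, z):
    """Equation (2): P_X(x) P_{Y|X}(y|x) P_{X|Y}(z|y).

    P_X(x) P_{Y|X}(y|x) is just the joint P[x][y].
    """
    return P[x][y] * (P[z][y] / p_y[y])

# Joint pmf of (X', Y): sum the three-way joint over x.
Q = [[sum(joint3(x, y, z) for x in range(NX)) for y in range(NY)]
     for z in range(NX)]

# Largest entrywise gap between the joint of (X', Y) and that of (X, Y).
max_gap = max(abs(Q[z][y] - P[z][y]) for z in range(NX) for y in range(NY))
print(max_gap)
```

Since `Q` matches `P` entry by entry up to rounding, any expectation of the form $E[f(X',Y)]$ computed from `Q` equals $E[f(X,Y)]$ computed from `P`, whatever $f$ is.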

grand_chat