3

Suppose I want to calculate $$E_X[E_Y[g(X,Y)]].$$ Am I supposed to calculate $$ \int \int g(x,y) f(x) f(y) \,dx\,dy$$ or $$\int \int g(x,y) f(x,y) \,dx\,dy\,? $$

Here $f(x)$ is the marginal pdf and $f(x,y)$ is the joint pdf. Assume the expectation exists.

Michael Hardy
user1292919

2 Answers

3

I wonder if any respectable probabilist ever wrote $\operatorname E_Y[g(X,Y)].$ In standard usage, the random variables $X,Y$ have some joint distribution, and $\operatorname E[g(X,Y)]$ means an integral with respect to that joint distribution. If you want to integrate with respect to $Y,$ with $X$ fixed, then you're finding a conditional expected value, and it's written as $\operatorname E[g(X,Y)\mid X].$ That is a quantity whose value is determined by the value of $X.$ It is random because $X$ is random. Since it is random, one may ask what its expected value is, and that is denoted by $\operatorname E[ \operatorname E[g(X,Y)\mid X]],$ and that is an iterated integral whose value is the same as that of $\operatorname E[g(X,Y)].$ The reason for computing it as an iterated integral in that way is often simply that you have a method for calculating each of the two integrals.

One should not use the same symbol, $f,$ to refer to three different functions. What is $f(x)$ when $x=3$ and what is $f(y)$ when $y=3$? They're both $f(3).$ What does that mean if the two $f$s are two different functions? One can write $f_X(3)$ and $f_Y(3)$ and know what it means (with capital $X$ and capital $Y,$ meaning the two random variables).

Now we have $$ \operatorname E[g(X,Y)] = \iint g(u,v)f_{X,Y}(u,v)\, d(u,v) $$ or synonymously $$ \operatorname E[g(X,Y)] = \iint g(s,t)f_{X,Y}(s,t)\, d(s,t) $$ etc. Changing the names of the bound variables doesn't change the value of the integral; one can equally validly call them $x,y.$

A conditional expectation can be written as $$ \operatorname E[g(X,Y)\mid X=x] = \int g(x,v) f_{Y\,\mid\,X\,=\,x}(v)\, dv. $$ This is a function of (lower-case) $x.$ That same function evaluated at (capital) $X$ is a random variable, denoted by $\operatorname E[g(X,Y)\mid X].$
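
As a sketch of why the iterated expectation agrees with the expectation under the joint distribution, one can chain the two integrals. This assumes that the joint density factors as $f_{X,Y}(u,v) = f_X(u)\,f_{Y\,\mid\,X\,=\,u}(v)$ and that the expectation exists, so that the iterated integral may be combined into a double integral:

$$
\operatorname E\big[\operatorname E[g(X,Y)\mid X]\big]
= \int \left( \int g(u,v)\, f_{Y\,\mid\,X\,=\,u}(v)\, dv \right) f_X(u)\, du
= \iint g(u,v)\, f_{X,Y}(u,v)\, d(u,v)
= \operatorname E[g(X,Y)].
$$

If $X$ and $Y$ happen to be independent, then $f_{Y\,\mid\,X\,=\,u}(v) = f_Y(v)$, and only in that case does the inner density reduce to the marginal density of $Y.$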

Michael Hardy
  • 1
    Don't be so harsh, since it also appears in Stanford's notes: https://web.stanford.edu/class/archive/ee/ee278/ee278.1114/expectation4slx4.pdf – gunes Jun 09 '19 at 03:05
  • 4
    @gunes : I am well aware that this usage is often seen in things like writings of professors of electrical engineering and the like, and that is a reason why it continues to be seen so often and thus that is the reason why it is important to denounce it as the nonsense that it is. $\qquad$ – Michael Hardy Jun 09 '19 at 03:16
  • 1
    @gunes : I have thought of writing an account of why this is a mistake and of what to do instead, for the benefit of professors like the one who wrote the notes to which you link, but I don't know where to publish it where those who need it would see it. It should also be circulated among mathematicians in order to tell them how to communicate in language that those foreigners will understand. It would not be in any way harsh but it would say that the mistaken way to do it has no value and just causes confusion and makes some things hard or impossible to understand. – Michael Hardy Jun 09 '19 at 03:20
  • I also don't like this abuse of notation, especially things like $f(x), p(x)$, and I find Bertsekas's notation, which is the one you're using here, very consistent and extremely precise. However, a lot of important books in ML (especially books with a Bayesian perspective) use it, e.g. the books of C. Bishop, K. Murphy, S. Theodoridis, T. Hastie, D. Barber, P. Congdon, S. Marsland etc. (which are the ones I could recall now). The thing is all over the literature. I let my guard down on the issue a long time ago. – gunes Jun 10 '19 at 05:10
  • @gunes : Being all over the literature is the reason why a published explanation of why it's a mistake would be in order. $\qquad$ – Michael Hardy Jun 11 '19 at 00:31
  • I respect that, but it seems too late :) – gunes Jun 11 '19 at 05:27
  • @gunes : I don't think it's too late. Certainly it's too late to prevent it from being all over the literature, but is it too late to instruct people? The problem is: Where can one publish to reach this audience? – Michael Hardy Jun 11 '19 at 05:29
0

The subscript on the expectation operator generally denotes which density to use; apparently, in your notation, $E_Y[\cdot]$ means the expected value of the inner term using the density of the random variable $Y$. So, your first answer is correct. The second one is actually the law of iterated expectations, since $f(x,y)=f(x)f(y|x)$, which results in an inner expectation of the form $E_{Y|X}$.
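
To make the difference between the two integrals in the question concrete, here is a minimal numerical sketch. It is not from the original posts: it swaps the densities for a small discrete joint pmf so that the integrals become finite sums, and the pmf `joint` and the function `g` are hypothetical choices made only for illustration.

```python
from itertools import product

# Hypothetical joint pmf of a dependent pair (X, Y) on {0, 1} x {0, 1}.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}


def g(x, y):
    # Hypothetical integrand; any function of (x, y) would do.
    return x * y


xs = sorted({x for x, _ in joint})
ys = sorted({y for _, y in joint})

# Marginal pmfs obtained by summing the joint pmf over the other variable.
p_x = {x: sum(joint[(x, y)] for y in ys) for x in xs}
p_y = {y: sum(joint[(x, y)] for x in xs) for y in ys}

# Second formula in the question: weight g by the joint pmf.
expect_joint = sum(g(x, y) * joint[(x, y)] for x, y in product(xs, ys))

# First formula in the question: weight g by the product of the marginals.
expect_product = sum(g(x, y) * p_x[x] * p_y[y] for x, y in product(xs, ys))


def cond_expect(x):
    # E[g(X, Y) | X = x], using the conditional pmf joint / p_x.
    return sum(g(x, y) * joint[(x, y)] / p_x[x] for y in ys)


# Iterated expectation E[E[g(X, Y) | X]].
expect_iterated = sum(cond_expect(x) * p_x[x] for x in xs)

print(expect_joint, expect_product, expect_iterated)
```

With this particular pmf the joint-weighted sum and the iterated conditional expectation agree, while the product-of-marginals sum gives a different number because $X$ and $Y$ are dependent. For an independent pair all three would coincide.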

gunes
  • Using subscripts this way is just nonsense. It is an error and any reply should explain why. – Michael Hardy Jun 09 '19 at 02:51
  • There are so many usages of this kind, and I just conformed. Another old discussion is here: https://stats.stackexchange.com/questions/72613/subscript-notation-in-expectations – gunes Jun 09 '19 at 03:02