
I am trying to understand the derivation of the expected prediction error (equation 2.11 in The Elements of Statistical Learning), and there is a specific step I do not understand.

We start with

$EPE(f) = E\big[(Y - f(X))^{2}\big]$

and I understand the derivation up to:

$\int_{x} E_{Y \mid X}\big(L(x,y)\big)\,p(x)\,dx$

(the omitted steps can be seen here: Confused by Derivation of Regression Function)

However, I do not understand how the above is equivalent to:

$E_{X}E_{Y \mid X}L(x,y)$

Why are we multiplying $E_{X}$ by $E_{Y \mid X}$? Where does $E_{X}$ come from? What probability rule/theorem is responsible for this?


1 Answer


$E_XE_{Y\mid X}$ is not a multiplication; it is a composition of expectations (an iterated expectation).

If both $X$ and $Y$ are random, then $L(X,Y)$ is a real-valued random variable. Taking the inner expectation "integrates out" $Y$ and produces a function of $x$: $$ Z(x) = E_{Y\mid X}\left[L(x,Y)\mid X=x\right] = \int_y L(x,y)\,p(y\mid x)\,dy. $$ Evaluated at the random input, $Z(X)$ is still a random variable because $X$ is: for each realization $X=x$ it takes the value $Z(x)$.

As with any random variable, you can take the expectation of $Z(X)$ with respect to $X$: $$ E_X\left[Z(X)\right] = E_X\Big[E_{Y\mid X}\left[L(X,Y)\mid X\right]\Big]. $$ Writing this outer expectation as an integral against the marginal density $p(x)$ gives exactly $\int_{x} E_{Y \mid X}\big(L(x,y)\big)\,p(x)\,dx$, so the two expressions agree. The rule behind it is the law of total expectation (the tower property of conditional expectation).
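You can also see the composition numerically. Below is a minimal Monte Carlo sketch in Python/NumPy; the toy distribution ($X \sim \text{Uniform}(0,1)$, $Y \mid X = x \sim N(2x, 1)$, squared loss against $f(x) = 2x$) is my own choice for illustration, not from ESL. The point is only that averaging $L(X,Y)$ over the joint distribution matches first forming $Z(x) = E_{Y\mid X}[L(x,Y)\mid X=x]$ and then averaging $Z(X)$ over $X$.

```python
import numpy as np

# Toy setup (assumed for illustration, not from ESL):
# X ~ Uniform(0, 1), Y | X = x ~ Normal(2x, 1), and L(x, y) = (y - 2x)^2.
# Tower property: E[L(X, Y)] = E_X[ E_{Y|X}[L(X, Y) | X] ].
rng = np.random.default_rng(0)

# Left-hand side: average L over a single sample from the joint distribution.
x = rng.uniform(size=500_000)
y = rng.normal(2.0 * x, 1.0)
lhs = np.mean((y - 2.0 * x) ** 2)

# Right-hand side: the composition. First the inner expectation, i.e. the
# function Z(x) = E_{Y|X}[L(x, Y) | X = x], estimated by simulating Y at a
# fixed x; then the outer expectation of Z(X) over fresh draws of X.
def Z(x_val, m=2_000):
    y_cond = rng.normal(2.0 * x_val, 1.0, size=m)  # draws from Y | X = x_val
    return np.mean((y_cond - 2.0 * x_val) ** 2)

x_outer = rng.uniform(size=2_000)
rhs = np.mean([Z(xv) for xv in x_outer])

print(lhs, rhs)  # both are close to Var(Y | X) = 1
```

Both estimates converge to the same number, which is the equality $E[L(X,Y)] = E_X E_{Y\mid X} L(X,Y)$ in action.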
