In The Elements of Statistical Learning, Section 2.4, it is written that
it suffices to minimize the EPE (expected prediction error) pointwise,
where
EPE = $E_{X} E_{Y|X}([Y-f(X)]^2 \mid X)$
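To be explicit about what I understand "pointwise" to mean: as far as I can tell, the book's claim is that the minimizer can be found separately at each fixed point $x$. The symbol $c$ below is my own notation for the candidate prediction at that point, and $E_{Y|X}$ denotes the expectation of $Y$ conditional on $X$:

$$f(x) = \operatorname{argmin}_{c} \; E_{Y|X}\left([Y-c]^2 \mid X = x\right)$$

So the inner conditional expectation is minimized over a single number $c$ for each $x$, rather than over the whole function $f$ at once.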
Why is minimizing EPE equivalent to minimizing $E_{Y|X}([Y-f(X)]^2 \mid X)$? I see that Matthew Drury has posted an answer at "Expected prediction error - derivation".
Matthew said that you can minimize a sum of non-negative quantities by minimizing the summands individually. But I think that is true only when the summands are independent of each other. Are we assuming that in this case?