From what I've read, the main advantage of the EM algorithm is that the expectation step can be computed in closed form, which gives a deterministic answer and thus zero variance.
What, then, is the rationale behind MCEM (Monte Carlo EM) methods [1], which use sampling to compute the E-step? Specifically, is there theoretical or empirical evidence that MCEM gives lower variance than simply sampling the full likelihood, or are there other advantages of the EM algorithm that come into play here?
[1] http://www.biostat.jhsph.edu/~rpeng/biostat778/papers/wei-tanner-1990.pdf
Edit: To clarify, I mean that if you have the log-likelihood $\log \sum_z p(y, z \mid x)$, with $z$ latent, then one option is to use EM (or MCEM if the distribution over $z$ used in the E-step does not give you a closed-form expectation). The other option I can see is to estimate the sum over $z$ directly via sampling. So my question is: if you're using sampling anyway, why use MCEM rather than integrating the likelihood directly?
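To make the two options concrete, here is a minimal sketch on a toy model of my own choosing (not taken from [1]): an equal-weight, unit-variance two-component Gaussian mixture in which the component label is the latent $z$ and the means `mu` are the unknown parameters. All function names (`exact_loglik`, `direct_mc_loglik`, `mcem_step`) and the sample sizes are hypothetical, picked only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def exact_loglik(y, mu):
    # Exact marginal log-likelihood; tractable here because z takes only 2 values.
    dens = np.exp(-0.5 * (y[:, None] - mu[None, :]) ** 2) / np.sqrt(2 * np.pi)
    return np.log(0.5 * dens.sum(axis=1)).sum()

def direct_mc_loglik(y, mu, n_samples):
    # "Direct" option: sample z from its prior and average p(y | z, mu).
    # The average is unbiased for p(y | mu); its log is biased downward (Jensen).
    z = rng.integers(0, 2, size=(n_samples, y.size))
    lik = np.exp(-0.5 * (y - mu[z]) ** 2) / np.sqrt(2 * np.pi)
    return np.log(lik.mean(axis=0)).sum()

def mcem_step(y, mu, n_samples):
    # Monte Carlo E-step: draw z from the posterior p(z | y, mu).
    log_odds = 0.5 * (y - mu[0]) ** 2 - 0.5 * (y - mu[1]) ** 2
    p1 = 1.0 / (1.0 + np.exp(-log_odds))
    z = rng.random((n_samples, y.size)) < p1
    # M-step: maximise the sampled Q-function, which here reduces to
    # weighted means with Monte Carlo responsibilities as weights.
    w1 = z.mean(axis=0)
    w0 = 1.0 - w1
    return np.array([(w0 * y).sum() / w0.sum(), (w1 * y).sum() / w1.sum()])

# Simulated data (assumed true means -2 and 2).
mu_true = np.array([-2.0, 2.0])
y = mu_true[rng.integers(0, 2, size=200)] + rng.normal(size=200)

mu_hat = np.array([-1.0, 1.0])
for _ in range(20):
    mu_hat = mcem_step(y, mu_hat, n_samples=50)

ll_exact = exact_loglik(y, mu_hat)
ll_mc = direct_mc_loglik(y, mu_hat, n_samples=2000)
print(mu_hat, ll_exact, ll_mc)
```

In this sketch the direct estimator only evaluates the log-likelihood at a given `mu` and would still need an outer optimiser over `mu`, whereas the MCEM step returns an updated `mu` in one pass. Whether that difference translates into lower variance in general is exactly what I am asking.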
Edit 2: Replaced MCMC with sampling which is what I had in mind -- got the names confused, sorry.