I am reading a text on transformation and variate generation which says that if
- we want to simulate V,
- but we know only its conditional distribution given U,
- and we can simulate U
then we can simulate V as follows:
- Draw u_sim from the marginal distribution of U
- Draw v_sim from the conditional distribution of V given U=u_sim
then v_sim is a variate from the marginal distribution of V. This is the part I don't understand: why are we allowed to say that we have simulated from the marginal distribution $p(v)$ and not from the conditional $p(v|u)$?
Maybe this is just an approximation, and the distribution of the simulated values converges to $p(v)$ for large samples?
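To make the question concrete, here is a minimal numerical sketch of the two-step procedure. The distributions are my own assumption, not taken from the text: U ~ N(0, 1) and V given U=u ~ N(u, 1), chosen because the marginal of V is then known in closed form as N(0, 2). As far as I can tell, the relevant identity is $p(v) = \int p(v|u)\,p(u)\,du$: the pair (u_sim, v_sim) is a draw from the joint density $p(u)\,p(v|u)$, and keeping only v_sim amounts to integrating u out. The function name `simulate_v` is hypothetical.

```python
import math
import random
import statistics


def simulate_v(rng):
    """One draw of V via the two-step scheme (assumed toy model)."""
    u_sim = rng.gauss(0.0, 1.0)    # step 1: u_sim from the marginal of U, N(0, 1)
    v_sim = rng.gauss(u_sim, 1.0)  # step 2: v_sim from V | U=u_sim, N(u_sim, 1)
    return v_sim                   # u_sim is discarded


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 100_000
draws = [simulate_v(rng) for _ in range(n)]

# Compare the sample with the known marginal N(0, 2) of this toy model.
sample_mean = statistics.fmean(draws)
sample_var = statistics.pvariance(draws)
empirical_cdf_at_1 = sum(v <= 1.0 for v in draws) / n
exact_cdf_at_1 = statistics.NormalDist(0.0, math.sqrt(2.0)).cdf(1.0)

print(f"mean      {sample_mean:.3f}  (marginal: 0)")
print(f"variance  {sample_var:.3f}  (marginal: 2)")
print(f"P(V <= 1) {empirical_cdf_at_1:.3f}  (marginal: {exact_cdf_at_1:.3f})")
```

If this reading is right, each individual v_sim is already an exact draw from $p(v)$. The sample size only controls how closely the empirical summaries match the marginal, not whether the draws come from it.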