I'm reading about Monte Carlo methods. Suppose that $X_1,\ldots,X_n$ are i.i.d. with density $p(x_i|\theta)$, where $\theta$ is an unknown parameter of interest. My textbook states: "Suppose we could sample some number $S$ of independent, random $\theta$-values from the posterior $p(\theta|x)$. Then the empirical distribution of the sample $(\theta_1,\ldots,\theta_S)$ would approximate the posterior $p(\theta|x)$."
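To make the claim concrete, here is a minimal sketch of how I read it, assuming Bernoulli data with a Beta prior. That model is my own choice for illustration, not the textbook's. The names `true_theta`, `a`, `b` and the simulated data are all hypothetical. In this conjugate case the posterior is available in closed form as a Beta distribution, so drawing from it is straightforward:

```python
import random

random.seed(0)  # fixed seed so the run is reproducible

# Hypothetical data: n Bernoulli(theta) observations.
# true_theta is used only to simulate the data; the analysis never sees it.
true_theta = 0.3
n = 50
x = [1 if random.random() < true_theta else 0 for _ in range(n)]
k = sum(x)  # number of successes

# Assumed Beta(a, b) prior. By conjugacy the posterior is Beta(a + k, b + n - k).
a, b = 1.0, 1.0
post_a, post_b = a + k, b + n - k

# Draw S independent theta-values from the posterior p(theta | x).
S = 10_000
theta_samples = [random.betavariate(post_a, post_b) for _ in range(S)]

# A summary of the empirical distribution should be close to the exact value.
mc_mean = sum(theta_samples) / S
exact_mean = post_a / (post_a + post_b)
print(f"Monte Carlo posterior mean: {mc_mean:.4f}")
print(f"Exact posterior mean:       {exact_mean:.4f}")
```

Here the sample mean of the draws should land close to the exact posterior mean, and a histogram of `theta_samples` should resemble the Beta posterior density. This only works so directly because the posterior happens to be a named distribution that I can sample from.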
What does it mean to sample from a probability distribution that is unknown? How is this possible? I would greatly appreciate a simple example.