In the analysis of correlated data, you often work with replicates: a worker performs a task 2 or 20 times, or an individual's blood pressure is measured at random times over the course of a day. In either case, you expect some "similarity" among observations within the same cluster (the same worker or person).
On one hand, a random intercepts model will estimate a separate intercept $\beta_0$ for each individual under some reasonable distributional assumptions. On the other hand, you can fit a generalized least squares model with a compound symmetry (or exchangeable) correlation matrix. I understand that if the variability of the random intercepts is large relative to the residual variability, the off-diagonal element of the compound symmetry correlation matrix should be relatively large as well.
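To make that intuition concrete, here is a minimal sketch. It assumes the usual random intercepts setup $y_{ij} = \beta_0 + b_i + e_{ij}$, with $b_i$ and $e_{ij}$ independent and with variances $\sigma_b^2$ and $\sigma^2$. The function names and the numeric values are hypothetical, chosen only for illustration. Under those assumptions, two observations from the same cluster have covariance $\sigma_b^2$ and correlation $\sigma_b^2 / (\sigma_b^2 + \sigma^2)$, which is a compound symmetry pattern.

```python
def implied_covariance(var_intercept, var_residual, n_obs):
    """Marginal covariance of one cluster under y_ij = beta0 + b_i + e_ij.

    Diagonal entries are var_intercept + var_residual;
    off-diagonal entries are var_intercept (the shared b_i).
    """
    return [
        [var_intercept + (var_residual if j == k else 0.0) for k in range(n_obs)]
        for j in range(n_obs)
    ]


def implied_correlation(var_intercept, var_residual):
    """Within-cluster correlation implied by the two variance components."""
    return var_intercept / (var_intercept + var_residual)


# Illustrative values only: intercept variance 4, residual variance 1,
# three measurements per cluster.
cov = implied_covariance(4.0, 1.0, 3)
rho = implied_correlation(4.0, 1.0)

for row in cov:
    print(row)
print("implied within-cluster correlation:", rho)
```

Note that a correlation built this way cannot be negative, since $\sigma_b^2 \ge 0$, whereas an exchangeable correlation parameter in a GLS fit is not restricted in the same way.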
But are they equivalent formulations, and do they give similar inference? Is one more powerful than the other? How exactly do the data-generating mechanism and the modeling assumptions determine which one should be used?