In the analysis of correlated data, you often work with replicates: a worker performs a task 2 or 20 times, or an individual's blood pressure is measured at several random times during the day. In either case, you expect some degree of similarity among observations within the same cluster (such as a worker or a person).

On one hand, a random intercepts model will estimate a separate $\beta_0$ parameter for each individual under some reasonable distributional assumptions. On the other hand, you can fit a generalized least squares model with a compound symmetry (or exchangeable) correlation matrix. I understand that if the variance of the random intercepts is relatively large, the off-diagonal correlation in the compound symmetry matrix should be relatively large as well.

But are they equivalent formulations and/or do they give similar inference? Is one more powerful than the other? How exactly do data generating mechanisms and/or assumptions determine which should be used instead of the other?

AdamO
  • 52,330
  • 5
  • 104
  • 209

1 Answer

I found the answer in Pinheiro and Bates, "Mixed-Effects Models in S and S-PLUS", in the section on compound symmetry in chapter 5.

Basically, they are equivalent up to a point. Only compound symmetry allows $\rho$ to be negative, meaning that observations within a cluster can be more dissimilar than observations from different clusters. For example, this might be the case for sports teams where players are chosen for complementary abilities.
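
To see how far negative $\rho$ can go, here is a sketch of the compound symmetry covariance for one cluster. The cluster size $n$, the common marginal variance $\tau^2$, the identity matrix $I_n$, and the all-ones matrix $J_n$ are notation I am introducing for illustration:

$$ \mbox{Var}(Y_i) = \tau^2 \left[ (1 - \rho) I_n + \rho J_n \right] $$

The eigenvalues of this matrix are $\tau^2 (1 - \rho)$, with multiplicity $n - 1$, and $\tau^2 \left(1 + (n - 1) \rho \right)$, so it is a valid covariance matrix only when

$$ -\frac{1}{n - 1} < \rho < 1 $$

Negative values are therefore admissible, but the lower bound shrinks toward zero as clusters get larger.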

There is a very simple heuristic argument to show that a single-level random intercepts model is a special case of a compound symmetry model. Suppose the random intercepts model is the true data generating mechanism:

$$Y_{ij} = \beta_0 + \beta_1 X_{ij} + b_i + \epsilon_{ij}$$

With $b_i \sim \mathcal{N}\left(0, \sigma^2_b \right)$ and $\epsilon_{ij} \sim \mathcal{N} \left(0, \sigma^2 \right)$ independent. Suppose further that we fit the fixed effects model $E[Y_{ij}|X_{ij}] = \beta_0 + \beta_1 X_{ij}$ (NB: I'm using the same $\beta$ notation because the parameter estimates are still unbiased). Then the covariance between two residuals (from the fixed effects model) in the same cluster is:

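$$ \mbox{Cov}(e_{i1}, e_{i2}) = \mbox{Cov}(b_i + \epsilon_{i1}, b_i + \epsilon_{i2}) = \sigma^2_b$$

Each residual has variance $\sigma^2_b + \sigma^2$, so the within-cluster correlation is $\rho = \sigma^2_b / (\sigma^2_b + \sigma^2)$, which cannot be negative.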
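
A quick simulation can check the within-cluster moments implied by the random intercepts model. This is only a sketch under stated assumptions: the function name `simulate_residual_moments` and the values $\sigma_b = 1.5$, $\sigma = 1$ are hypothetical choices of mine, and the residuals are formed as $e_{ij} = b_i + \epsilon_{ij}$, i.e. as if the fixed effects $\beta_0, \beta_1$ were known exactly rather than estimated. It compares the empirical covariance and correlation of two residuals from the same cluster against $\sigma^2_b$ and $\sigma^2_b / (\sigma^2_b + \sigma^2)$.

```python
import math
import random


def simulate_residual_moments(n_clusters=20000, sigma_b=1.5, sigma=1.0, seed=0):
    """Simulate two residuals per cluster, e_ij = b_i + eps_ij.

    sigma_b and sigma are standard deviations (illustrative values).
    Returns (covariance, pooled variance, correlation) of (e_i1, e_i2).
    """
    rng = random.Random(seed)
    e1, e2 = [], []
    for _ in range(n_clusters):
        b = rng.gauss(0.0, sigma_b)            # shared random intercept b_i
        e1.append(b + rng.gauss(0.0, sigma))   # residual for observation 1
        e2.append(b + rng.gauss(0.0, sigma))   # residual for observation 2

    m1 = sum(e1) / n_clusters
    m2 = sum(e2) / n_clusters
    denom = n_clusters - 1
    cov = sum((a - m1) * (c - m2) for a, c in zip(e1, e2)) / denom
    var1 = sum((a - m1) ** 2 for a in e1) / denom
    var2 = sum((c - m2) ** 2 for c in e2) / denom
    corr = cov / math.sqrt(var1 * var2)
    return cov, (var1 + var2) / 2.0, corr


cov, var, corr = simulate_residual_moments()
print("empirical covariance :", round(cov, 3), "(theory: sigma_b^2 = 2.25)")
print("empirical variance   :", round(var, 3), "(theory: sigma_b^2 + sigma^2 = 3.25)")
print("empirical correlation:", round(corr, 3), "(theory:", round(2.25 / 3.25, 3), ")")
```

Because the shared $b_i$ is the only thing two residuals in a cluster have in common, no choice of $\sigma_b$ in this simulation can produce a negative correlation, which is the sense in which random intercepts is the more restrictive model.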

AdamO