Can fixed-effects become biased due to random structure misspecification

Question

I'm following-up on this great answer. Essentially, I was wondering how could misspecification of random-effects bias the estimates of fixed-effects?

So, can the same set of fixed-effect coefficients become biased if we create models that only differ in their random-effect specification?

Also as a conceptual matter, can we say in mixed-effect models, the fixed-effect coef is some kind of (weighted) average of the individual regression counterparts fit to each individual cluster and that is why fixed-effect coefs in mixed models can prevent something like this Simpson's Paradox case from happening?

A possible R demonstration is appreciated.

I've given an example of this here: [Can adding a random intercept change the fixed effect estimates in a regression model?](https://stats.stackexchange.com/a/478580/42952) — Eoin, Aug 12 '21 at 16:43
@Eoin, thank you, your answer's not exactly the question I'm asking here, you may want to look at the [linked post in my question](https://stats.stackexchange.com/a/540019/140365) which my current question is a follow-up on. But back to your own answer, what, do you think, the fixed-effect coef in your mixed-effect model represents that the OLS coefficient doesn't? For example, can we say in the mixed-effect model, the fixed-effect is some kind of (weighted) average of the individual regression lines and that is why it is more reasonable than the OLS coef? — rnorouzian, Aug 12 '21 at 16:56
In the Simpson's case, the OLS coef estimates the slope across the whole dataset, while the MLM fixed effect estimates the average slope within each group, which, because of the negative relationship between in intercept and slope, is of the opposite sign. — Eoin, Aug 12 '21 at 18:26
@Eoin, so you also like Geoffrey (see comments his answer below) believe that can we in mixed-effect models, the fixed-effect coef is some kind of (weighted) average of the individual regression coefs fit to each cluster and that is why it is more reasonable than the OLS coef and perhaps the reason they can prevent [something like](https://stats.stackexchange.com/a/478580/42952) this from happening? — rnorouzian, Aug 12 '21 at 18:41

score 5 · Accepted Answer · answered Aug 12 '21 at 21:29

Can fixed-effects become biased due to random structure misspecification

Yes they can. Let's do a simulation in R to show it.

We will simulate data according to the following model:

Y ~ treatment + time + (1 | site) + (time | subject)

So we have fixed effects for treatment and time, random intercepts for subject nested within site and random slopes for time over subject. There are many things that we can vary with this simulation and obviously there is a limit to what I can do here. But if you (or others) have some suggestions for altering the simulations, then please let me know. Of course you can also play with the code yourself :)

In order to look at bias in the fixed effects we will do a Monte Carlo simulation. We will make use of the following helper function to determine if the model converged properly or not:

hasConverged <- function (mm) {
  
  if ( !(class(mm)[1] == "lmerMod" | class(mm)[1] == "lmerModLmerTest"))  stop("Error must pass a lmerMod object")
  
  retval <- NULL
  
  if(is.null(unlist(mm@optinfo$conv$lme4))) {
    retval = 1
  }
  else {
    if (isSingular(mm)) {
      retval = 0
    } else {
      retval = -1
    }
  }
  return(retval)
}

So we will start by setting up the parameters for the nested factors:

n_site <- 100; n_subject_site <- 5;  n_time <- 2

which are the number of sites, the number of subjects per site and the number of measurements within subjects.

So now we simulate the factors:

dt <-  expand.grid(
  time = seq(0, 2, length = n_time),
  site = seq_len(n_site),
  subject = seq_len(n_subject_site),
  reps = 1:2
) %>%
  mutate(
    subject = interaction(site, subject),
    treatment = sample(0:1, size = n_site * n_subject_site,, replace =
                         TRUE)[subject],
    Y = 1
    
  )
X <- model.matrix(~ treatment + time, dt)  # model matrix for fixed effects

where we also add a column of 1s for the reponse at this stage in order to make use of the lFormula function in lme4 which can construct the model matrix of random effects Z:

myFormula <- "Y ~ treatment + time + (1 | site) + (time|subject)"

foo <- lFormula(eval(myFormula), dt)
Z <- t(as.matrix(foo$reTrms$Zt))

Now we set up the parameters we will use in the simulations:

# fixed effects
intercept <- 10; trend <- 0.1; effect <- 0.5   

# SDs of random effects 
sigma_site <- 5; sigma_subject_ints <- 2; sigma_noise <- 1; sigma_subj_slopes <- 0.5 

# correlation between intercepts and slopes for time over subject
rho_subj_time <- 0.2  

betas <- c(intercept, effect, trend) # Fixed effects parameters

Then we perform the simulations:

n_sim <- 200
# vectrs to store the fixed effects from each simulations
vec_intercept <- vec_treatment <- vec_time <- numeric(n_sim)   

for (i in 1:n_sim) {
 set.seed(i)

  u_site <- rnorm(n_site, 0, sigma_site) # standard deviation of random intercepts for site

  cormat <-  matrix(c(sigma_subject_ints, rho_subj_time, rho_subj_time, sigma_subj_slopes), 2, 2)  # correlation matrix 
  covmat <- lme4::sdcor2cov(cormat)   

  umat <- MASS::mvrnorm(n_site * n_subject_site, c(0, 0), covmat, empirical = TRUE)  # simulate the random effects

  u_subj <- c(rbind(umat[, 1], umat[, 2]))  # lme4 needs the random effects in this order (interleaved) when there are slopes and intercepts

  u <- c(u_subj, u_site)

  e <- rnorm(nrow(dt), 0, sigma_noise)   # residual error

  dt$Y <- X %*% betas + Z %*% u + e   

  m0 <- lmer(myFormula, dt)
  
  summary(m0) %>% coef() -> dt.tmp

  if(hasConverged(m0)) {
    vec_intercept[i] <- dt.tmp[1, 1]
    vec_treatment[i] <- dt.tmp[2, 1]
    vec_time[i] <- dt.tmp[3, 1]
  } else {
    vec_intercept[i] <- vec_treatment[i] <- vec_time[i] <- NA
  }

}

And finally we can check for bias:

mean(vec_intercept, na.rm = TRUE)
## [1] 10.04665
mean(vec_treatment, na.rm = TRUE)
## 0.497358
mean(vec_time, na.rm = TRUE)
## [1] 0.09761494

...and these agree closely with the values used in the simulation: 10, 0.5 and 0.1.

Now, let us repeat the simulations, based on the same model:

Y ~ treatment + time + (1 | site) + (time|subject)

but instead of fitting this model, we will fit:

Y ~ treatment + time + (1 | site)

So we just need to make a simple change:

 m0 <- lmer(myFormula, dt)

to

 m0 <- lmer(Y ~ treatment + time + (1 | site), data = dt )

And the results are:

mean(vec_intercept, na.rm = TRUE)
## [1] 10.04169
mean(vec_treatment, na.rm = TRUE)
##[1] 0.5068864
mean(vec_time, na.rm = TRUE)
##[1] 0.09761494

So that's all good.

Now we make a simple change:

n_site <- 4

So now, instead of 100 sites, we have 4 sites. We retain the number of subjects per site (5) and the number of time points per subject (2).

For the "correct" model, the results are:

mean(vec_intercept, na.rm = TRUE)
## 10.16447
mean(vec_treatment, na.rm = TRUE)
## [1] 0.422812
mean(vec_time, na.rm = TRUE)
## [1] 0.1049933

Now, while the intercept and time are close to unbiased, the treatment fixed effect is a little off (0.42 vs 0.5, a bias of around -15% which perhaps stregthens the argument for not fitting random intercepts at all for such a small group even when the random structure is correct). But, if we fit the "wrong" model, the results are:

mean(vec_intercept, na.rm = TRUE)
## [1] 10.0194
mean(vec_treatment, na.rm = TRUE)
## [1] 0.7084542
mean(vec_time, na.rm = TRUE)
## [1] 0.1029664

So now we find the bias of around +42%

As mentioned above, there are a huge number of possible ways this simulation can be altered and adapted, but it does show that biased fixed effects can result when the random structure is wrong, as requested.

Dear Robert, [HERE](https://stats.stackexchange.com/q/541797/140365) is a question of possible interest. — rnorouzian, Aug 29 '21 at 01:09

score 2 · Answer 2 · answered Aug 12 '21 at 16:31

2

Heterogeneity alone does not constitute a random variable. It is the act of sampling that does. If, for example, you have collected data at various sites and decide to treat site as a random effect you are saying that these sites were randomly selected from a broader population of sites and individual patients represent repeated measurements on each site. Site is now the primary sampling unit. The endpoint and regression coefficients could be interpreted to pertain to sites instead of people. If you did not actually randomly select these sites from a broader population, that is you would not imagine using an entirely different set of sites each time the experiment is performed, then site should not be treated as a random effect. If sites are not considered random, any inference we derive is on a broader population of subjects who visit the fixed collection of sites in the study. We could include site as a fixed effect but this makes the results site specific which may also not be of interest. If we treat sites as random anyways this can most certainly introduce bias in estimation and inference. It all comes down to the sampling scheme.

answered Aug 12 '21 at 16:31

Geoffrey Johnson

2,460
3
12

Thanks Geoffrey, viewing this from a different angle, can we say in mixed-effect models, the fixed-effect coef is some kind of (weighted) average of the individual regression coefs fit to each cluster and that is why it is more reasonable than the OLS coef and perhaps the reason they can preventing [something like this](https://stats.stackexchange.com/a/478580/42952) from happening? – rnorouzian Aug 12 '21 at 17:25
Great question. Yes, I think that is a very good way to think about the fixed-effect. A good way to interpret Simpson's paradox is to think in terms of causal inference and the sampling scheme. There is actually no right or wrong answer in the figure you linked to. It all depends what you are making inference on. If you are interested in a population of subjects then we should focus on the subject-specific curves and their (weighted) average. – Geoffrey Johnson Aug 12 '21 at 17:46
Geoffrey, I actually asked this [as a question](https://stats.stackexchange.com/q/539946/140365). But the answer was a big NO. That is fixed-effect coefs in mixed-models have NOTHING to do we the random-effects. BUT for a moment, lets say that in fact, what you & I agree (that fixed-effects are some kind of average of cluster-specific coefs) is correct. Then, this would also mean that if we fit models with the exact same fixed-effects but differing random-effects, then fixed coefs are expected to be different, right? But [this answer](https://stats.stackexchange.com/a/540019/140365) says . . . – rnorouzian Aug 12 '21 at 17:53
. . ., that is not the case. And any differnces in fixed coeffs is simply due to bias only and has nothing to do with how one specifies the random-effects. – rnorouzian Aug 12 '21 at 17:54
Agreed that this is a very useful way to assess whether factors should be random or fixed, but it's not the only way, and many times there are conflicting considerations. Examples include where we have the entire population (eg every hospital in the country). See the accp[ted answer to [this](https://stats.stackexchange.com/questions/4700/) for some other considerations. Different areas of applied work tend to put more emphasis on some of these considerations than others. – Robert Long Aug 12 '21 at 20:16
@RobertLong, Dear Robert, I emailed you the class notes, I also noticed Geoffrey and Eoin on this thread believe that we can profitably say that in mixed-effect models, any fixed-effect coef is some kind of (weighted) average of the individual regression coefs fit to each cluster. I wonder what your thoughts are about this issue and whether you still believe that value of fixed-effects have no connection to how random-effects are specified in the model (e.g., whether that fixed-effect has a corresponding random-effect or not, how many grouping variables it is allowed to vary in etc.)? – rnorouzian Aug 12 '21 at 21:29
Thanks for email. I'll take a look. Whether they are a weighted average is beside the point. The meaning/interpretation that you asked about for a fixed effect is the estimated change in the reponse for a unit change in the variable, leaving others unchanged. This meaning and interpretation does not depend on how the fixed effects are estimated. The meaning/interpretaion is the same, regardless. As for bias in the fixed effects resulting from a mis-specified random structure, I have just posted an answer to this question. Feel free to make use of the code yourself or suggest any changes – Robert Long Aug 12 '21 at 21:37
@RobertLong, thank you so very much Roberts, much appreciated. – rnorouzian Aug 12 '21 at 21:39
1

I agree with Robert Long's last statement. – Geoffrey Johnson Aug 12 '21 at 23:57

Can fixed-effects become biased due to random structure misspecification

2 Answers2

Linked