
I have started to learn about fitting linear mixed effects models, using the lme4 package. To demonstrate my problem I used the following example.

> library("lme4")
> model <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
> dim(sleepstudy)
[1] 180   3

So there are 18 subjects, each with repeated measurements over 10 days.

To find confidence intervals, I found that the following code can be used.

> prof <- profile(model)
> confint(prof)
                  2.5 %      97.5 %
.sig01       14.3816801  37.7159899
.sig02       -0.4815003   0.6849854
.sig03        3.8011760   8.7533385
.sigma       22.8982726  28.8579967
(Intercept) 237.6806976 265.1295148
Days          7.3586541  13.5759163

To understand what this profile() function does, I found the dimensions of the prof object.

> dim(prof)
[1] 117   8

Can anyone explain what these 117 rows mean? Is there any relationship between the number 117 and the number of rows of the data set, which in this case is 180?

Also :

> head(prof)
      .zeta    .sig01    .sig02   .sig03   .sigma (Intercept)     Days   .par
1 -4.589519  1.005156 1.0000000 5.994445 28.22517    251.4051 10.46729 .sig01
2 -3.962371  4.244636 1.0000000 5.893631 28.32500    251.4051 10.46729 .sig01
3 -3.530075  6.976996 0.8414756 5.542949 27.89200    251.4051 10.46729 .sig01
4 -3.085699  9.346021 0.5732506 5.581650 27.29509    251.4051 10.46729 .sig01
5 -2.601510 11.584054 0.4189560 5.612988 26.78167    251.4051 10.46729 .sig01

Also, what is the use of .zeta and .par in calculating confidence intervals?

Any help would be highly appreciated, and it will help me to understand the concepts of linear mixed effects models.

student_R123

1 Answer

Let's start by clarifying what .sig01 (etc.) and .sigma represent in the output from confint(). (I figure that you understand, but other readers might not have studied so diligently.) The .sigma row is the standard deviation of the residual error. The rows of the form .sig0n are the standard-deviation and correlation estimates for the random effects in the model (note that .sig02, which spans negative and positive values, is the correlation). The default call to profile() produces these cryptic names (you can instead set signames = FALSE in the call), but as the help page for profile() in lme4 says, "these are ordered as in getME(.,"theta")." In this case:

> getME(model,"theta")
 Subject.(Intercept) Subject.Days.(Intercept)             Subject.Days 
          0.96673279               0.01516906               0.23090960 

Your call to lmer() allows for random intercepts (Subject.(Intercept)), random slopes (Subject.Days), and correlations between the random slopes and intercepts (Subject.Days.(Intercept)).
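Be aware that the theta values themselves are not the standard deviations on the natural scale: theta holds the lower-triangular Cholesky factor of the random-effect covariance matrix scaled by the residual standard deviation. As a sketch of how they line up (using VarCorr(), which reports the natural-scale estimates that .sig01 through .sig03 and .sigma are profiled on):

```r
library(lme4)  # assumes lme4 and its built-in sleepstudy data are available

model <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)

# Natural-scale summaries: random-effect SDs, their correlation, and
# the residual SD -- the quantities labeled .sig01, .sig02, .sig03,
# and .sigma in the profile output.
vc <- as.data.frame(VarCorr(model))
vc[, c("grp", "var1", "var2", "sdcor")]

# theta is the scaled Cholesky factor; e.g. its first element equals
# sd(random intercept) / sd(residual).
theta <- getME(model, "theta")
all.equal(unname(theta[1]), vc$sdcor[1] / sigma(model), tolerance = 1e-4)
```

For this model the first theta element (about 0.967) is just 24.74 / 25.59, the intercept SD divided by the residual SD.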

Next, it helps to put the deviance profile into the broader context of significance tests for models fit by maximum likelihood,* as explained on this page. Instead of just using a slope (score test) or the estimated width (Wald test) of the relationship between log-likelihood and values of an estimated parameter to gauge its significance, you must calculate the entire log-likelihood profile as a function of its possible values around its maximum-likelihood estimate. With multiple parameters estimated, you have to fix in place one of those possible values at a time and then re-optimize all of the others to get the deviance.

The entire object returned by profile() combines sets of rows from this analysis for all 6 of the estimated parameters. The 117 rows simply count how many specific parameter values happened to be evaluated across those 6 profiles; that number has no relationship to the 180 rows of the data set.
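You can see this directly by tabulating the .par column of the profile object (the exact per-parameter counts may vary with the lme4 version and optimizer settings, so treat the numbers below as illustrative):

```r
library(lme4)

model <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
prof <- profile(model)

# The profile object behaves like a data frame; .par records which
# parameter was being varied in each block of rows.  The per-parameter
# counts sum to nrow(prof) (117 in the question) and are unrelated to
# the 180 rows of sleepstudy.
table(prof$.par)
nrow(prof)
```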

The value in the .par column shows which particular parameter was being explicitly varied along a set of rows. Within a single set of .par values, the column of the corresponding parameter shows a choice of value around its maximum-likelihood estimate, and the columns of the other parameters hold their re-optimized estimates given that choice for the parameter being varied.

That leaves .zeta. Section 5.1 of the lmer vignette describes in detail where that comes from. For each row (a choice of a parameter and its value), it's a transformation of the likelihood-ratio test statistic to put it on the scale of a standard normal distribution.

We apply a signed square root transformation to this statistic and plot the resulting function, which we call the profile zeta function or ζ, versus the parameter value. The signed aspect of this transformation means that ζ is positive where the deviation from the parameter estimate is positive and negative otherwise, leading to a monotonically increasing function which is zero at the global optimum. A ζ value can be compared to the quantiles of the standard normal distribution. For example, a 95% profile deviance confidence interval on the parameter consists of the values for which −1.96 < ζ < 1.96.

You should find that the 95% confidence intervals reported by confint() for a parameter correspond to its values interpolated at ζ = -1.96 and ζ = 1.96 when it was the parameter being deliberately varied (the rows with its name in the .par column).
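As a rough check, you can recover the interval for Days from its own profile rows. (confint() actually uses spline interpolation of the ζ function, so simple linear interpolation with approx() will only agree approximately.)

```r
library(lme4)

model <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
prof <- profile(model)

# Keep only the rows where Days itself was being deliberately varied.
d <- prof[prof$.par == "Days", ]

# Linearly interpolate the Days values at zeta = +/- 1.96; these
# approximate the endpoints that confint(prof)["Days", ] reports
# (roughly 7.36 and 13.58 in the question).
z  <- qnorm(c(0.025, 0.975))
ci <- approx(d$.zeta, d$Days, xout = z)$y
ci
```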

Note that the issues around significance tests and confidence intervals for mixed-model parameter estimates can be quite difficult. Ben Bolker's FAQ page provides one good source for initial discussion and entry into the literature, and for discussion of additional issues when you move from standard linear to generalized linear mixed models. Bootstrapping or (pseudo or fully) Bayesian approaches might be considered if you are already willing to pay the extra computational cost of profile likelihood calculations.


*I'm ignoring for simplicity the distinction between REML, the default in your lmer() call, and maximum likelihood.

EdM