Which test does the lsmeans package use to compare the means in pairwise tests and what are its assumptions?

Question

I'm currently using lsmeans to compare the means of various groups using the contrast argument. I'm using data that follows a Gaussian distribution. Would an unbalanced data structure affect the rate of type I errors or inflate them?

Russ Lenth · Accepted Answer · 2018-06-15T17:53:42.687

5

It uses $t$ tests: each statistic is the estimated contrast divided by its estimated standard error. Both quantities come from the fitted model, so the validity of the result depends on the validity of the model.
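In symbols, a minimal sketch of that statistic (the notation $c$, $\hat\beta$, and $\hat V$ is introduced here for illustration and is not taken from the package documentation):

$$
t = \frac{c^\top \hat\beta}{\sqrt{c^\top \hat V\, c}}
$$

Here $\hat\beta$ is the vector of estimated model coefficients, $\hat V$ is their estimated covariance matrix, and $c$ holds the coefficients that express the contrast as a linear function of $\hat\beta$. Assuming the simplest case, a one-way lm() fit with $k$ groups, group sizes $n_i$, total sample size $N$, and residual standard deviation $\hat\sigma$, the comparison of groups $i$ and $j$ reduces to

$$
t = \frac{\bar y_i - \bar y_j}{\hat\sigma \sqrt{\dfrac{1}{n_i} + \dfrac{1}{n_j}}}, \qquad \text{df} = N - k .
$$

The group sizes enter the standard error directly, which is why unequal $n_i$ do not by themselves invalidate the test.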

If, for example, you fitted the model using lm(), the errors really are normally distributed with common variance (as the model assumes), and the model structure itself is correct (no missing predictors), then the $t$ statistics from lsmeans() are correct even with unbalanced data. With unadjusted tests, there is no bias in the type I error rate on a per-test basis.

However, most multiplicity adjustments, e.g. Tukey, are only approximate when there is imbalance. The `"mvt"` adjustment is exact in principle, but shows slight anomalies because its P values are computed using a simulation method.
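As a hedged illustration of the three options, here is a small sketch using **emmeans**, the successor package to **lsmeans**. The data set (`pigs`, which ships with emmeans and has unequal cell counts) and the model formula are assumptions chosen for the example; they are not the asker's data.

```r
# Sketch only: data set and model are illustrative, not from the question.
library(emmeans)

# 'pigs' is an unbalanced two-factor data set bundled with emmeans.
fit <- lm(log(conc) ~ source + factor(percent), data = pigs)

# Estimated marginal means for the 'source' factor.
emm <- emmeans(fit, "source")

# Unadjusted pairwise t tests (valid per test if the model is correct).
pairs(emm, adjust = "none")

# Tukey adjustment (approximate when the data are unbalanced).
pairs(emm, adjust = "tukey")

# Multivariate-t adjustment; P values are computed numerically, so
# fixing the seed keeps them reproducible from run to run.
set.seed(1)
pairs(emm, adjust = "mvt")
```

The estimates, standard errors, and $t$ ratios are the same in all three calls; only the P values differ, because the adjustment is applied after the tests are computed.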

edited Jun 15 '18 at 17:53

answered Jun 15 '18 at 03:30


So for unbalanced datasets, you suggest using the "mvt" method? Do you know if lsmeans supports linear mixed effects models generated by lme4? – ziab_m Jun 19 '18 at 01:25
Yes, but I suggest switching to the **emmeans** package (successor to **lsmeans**) where all new development is taking place. – Russ Lenth Jun 19 '18 at 01:28
PS look at `vignette("models", "emmeans")` for info and details on what models are supported. – Russ Lenth Jun 19 '18 at 01:30
