Relevance of Mixed Model Estimates vs. Observed Means

Question

This question is a follow-up to a previous question I asked regarding mixed model effects construction, linked here. It provides some background, although this is a broader question that has little to do with my specific model.

I have since constructed my model, taking care that assumptions are met etc. But at the risk of asking a perhaps noobish question, I'm struggling to get my head around the output and how to go about reporting it.

Via the model I can generate what I have taken to be estimates of the mean for each of my 10 treatment conditions at each of the three time steps, using the simple addition method shown in Robert's answer here. After comparison, however, it is evident that these model-based predictions deviate slightly from the observed means I calculate from my raw data, as you would expect from a prediction of a trend that takes the random effects and interactions into account. So my primary question is whether these predictions are at all valuable when discussing my results (i.e. under what circumstances one would refer to them or depict them graphically), or whether that is not their typical purpose and I should stick to the observed means for discussion.

Cheers for any help.

Does this answer your question? [What are LS means useful for?](https://stats.stackexchange.com/questions/332167/what-are-ls-means-useful-for). With unbalanced data, as you have, the raw means will not coincide with the results reported from your mixed model. See [this vignette](https://cran.r-project.org/web/packages/emmeans/vignettes/basics.html) for further information. — EdM, Aug 03 '20 at 21:28
These links are great, thanks, especially the vignette. However, I'm not yet fully confident about when each set of means is appropriate, as from the links I gather it's more an active point of disagreement, based on individual opinion, than an established way to approach things. I suppose what I'm after is a published example of a model like mine, with an interaction term as the fixed effect, so I can see which set of means they use. My problem is that I'm coming from an ecology/evolution background, and LMEMs appear quite sparse in that field. I'd appreciate any such links. — Calum Stephenson, Aug 04 '20 at 12:41
The mixed-effect aspect has little to do with this. The issue is the imbalance in the data set, which would pose a similar problem with a fixed-effect model. Raw means do not represent the phenomena underlying the data well; in your study they depend on the vagaries of which particular cases happened to be lost. Regression coefficients returned by fixed-effect models from unbalanced data don't agree with raw means either: yet would you not report those coefficients in preference to the raw means? I'm afraid that the link I provided over-emphasizes the "disagreement." — EdM, Aug 04 '20 at 14:45

score 0 · Answer 1 · answered Aug 04 '20 at 18:56

The estimates are marginal estimates of populations that contain the random effects, while the raw means are pure sample estimates. Suppose you bootstrapped the raw data. It is quite likely that the bootstrapped mean would differ from the raw mean. Different estimators for different estimands.
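A minimal sketch of why the two sets of means can disagree when the data are unbalanced, as raised in the comments. It is not a mixed model: it uses made-up numbers and only contrasts the raw mean per treatment with the equally weighted average of the treatment's cell means, which is the kind of quantity an estimated marginal mean targets in a simple fixed-effect setting. The data, the names `observations`, `raw_mean` and `equally_weighted_mean`, and the two-treatment, two-time layout are all assumptions for illustration.

```python
from statistics import mean

# Hypothetical unbalanced data: (treatment, time) -> observed responses.
# The response depends only on time (10 at time 1, 20 at time 2),
# but the treatments lost different numbers of cases at each time.
observations = {
    ("A", 1): [10.0, 10.0, 10.0, 10.0],
    ("A", 2): [20.0],
    ("B", 1): [10.0, 10.0],
    ("B", 2): [20.0, 20.0, 20.0],
}


def raw_mean(treatment):
    """Mean of all observations for a treatment, pooled over time.

    Each time point is implicitly weighted by how many cases survived.
    """
    values = [
        y
        for (trt, _time), ys in observations.items()
        if trt == treatment
        for y in ys
    ]
    return mean(values)


def equally_weighted_mean(treatment):
    """Average of the per-time cell means, each time point weighted equally."""
    cell_means = [
        mean(ys)
        for (trt, _time), ys in observations.items()
        if trt == treatment
    ]
    return mean(cell_means)


raw = {t: raw_mean(t) for t in ("A", "B")}
weighted = {t: equally_weighted_mean(t) for t in ("A", "B")}

print("raw means:             ", raw)
print("equally weighted means:", weighted)
```

In this toy data set the raw means differ between treatments only because different cells lost different numbers of cases, while the equally weighted means coincide. Which of the two is the right summary depends on the estimand you want to report.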
