The fundamental difference between estimated marginal means (EMMs) and ordinary marginal means of data (OMMs) is that OMMs summarize the data, while EMMs summarize a model. Thus, if you fit a different model to the data, the EMMs are potentially different. EMMs are not just one thing.
To be a bit more precise, EMMs involve three entities:
- A model for the data
- A grid consisting of all combinations of reference values for the predictors. Typically, the reference values are, in the case of factors, the levels of those factors; and in the case of numeric predictors, the means of those predictors.
- A weighting scheme (usually equal weights)
Given these, EMMs are obtained by first using the given model to obtain predictions at each combination of reference values; and then obtaining marginal averages of those predictions according to the weighting scheme.
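As a rough illustration of that two-step recipe (predict on the grid, then average), here is a minimal sketch in plain Python. Everything in it is hypothetical: the `emms` helper, the factor names `treat` and `sex`, and the additive `predict` function that stands in for a fitted model. It is not how any particular package implements EMMs.

```python
from itertools import product

def emms(predict, levels, margin, weight=lambda row: 1.0):
    """Average model predictions over the reference grid, per level of `margin`.

    predict: function mapping one grid row (a dict) to a predicted value
    levels:  dict of factor name -> list of reference values
    weight:  function giving the weight of a grid row (default: equal weights)
    """
    names = list(levels)
    # Step 1: the reference grid is every combination of reference values.
    grid = [dict(zip(names, combo)) for combo in product(*levels.values())]
    out = {}
    for lev in levels[margin]:
        rows = [row for row in grid if row[margin] == lev]
        w = [weight(row) for row in rows]
        # Step 2: weighted marginal average of the predictions.
        out[lev] = sum(wi * predict(row) for wi, row in zip(w, rows)) / sum(w)
    return out

# Hypothetical additive model standing in for a fitted model (assumption).
def predict(row):
    return 10.0 + {"A": 0.0, "B": 2.0}[row["treat"]] + {"F": 0.0, "M": 4.0}[row["sex"]]

levels = {"treat": ["A", "B"], "sex": ["F", "M"]}
treat_emms = emms(predict, levels, "treat")
print(treat_emms)  # {'A': 12.0, 'B': 14.0}
```

Swapping in a different `predict` (that is, a different model) or a different `weight` function changes the result, which is the sense in which EMMs are not just one thing.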
When equal weights are used, the model is fitted using lm() (or equivalent), all the predictors are factors, the design is balanced, and the model contains all interactions among these factors, the predicted values are the cell means of the data, and the EMMs are the same as the OMMs. However, any deviation from these conditions -- e.g., unequal weights, not using least squares, not having balanced data, having some numerical predictors, not having all interactions in the model -- may lead to the EMMs being different from the OMMs.
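A small numerical check of this, written as a sketch in plain Python with made-up data. It relies on the fact stated above that an all-factors, all-interactions least-squares fit predicts the cell means, so no model fitting is done here; the data sets and the `omm_and_emm` helper are hypothetical.

```python
def mean(xs):
    return sum(xs) / len(xs)

def omm_and_emm(data, treat):
    """Return (OMM, equal-weight EMM) for one level of the first factor.

    data maps (treat, sex) cells to lists of observed responses.
    The cell means stand in for the predictions of the all-interactions model.
    """
    cells = {cell: ys for cell, ys in data.items() if cell[0] == treat}
    omm = mean([y for ys in cells.values() for y in ys])  # average of the raw data
    emm = mean([mean(ys) for ys in cells.values()])       # average of the cell means
    return omm, emm

# Hypothetical balanced design: two observations in every cell.
balanced = {
    ("A", "F"): [10.0, 12.0], ("A", "M"): [19.0, 21.0],
    ("B", "F"): [13.0, 15.0], ("B", "M"): [23.0, 25.0],
}
# Hypothetical unbalanced design with the same cell means.
unbalanced = {
    ("A", "F"): [10.0, 12.0], ("A", "M"): [20.0],
    ("B", "F"): [14.0],       ("B", "M"): [22.0, 24.0, 26.0],
}

print(omm_and_emm(balanced, "A"))    # (15.5, 15.5)
print(omm_and_emm(unbalanced, "A"))  # (14.0, 15.5)
print(omm_and_emm(unbalanced, "B"))  # (21.5, 19.0)
```

With balanced data the two summaries coincide; with unbalanced data the OMM is pulled toward the more heavily populated cell, while the equal-weight EMM is not.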
Some further notes specific to other answers or comments in this thread:
Regarding empty cells: usually a model with all interactions will be unable to estimate all the grid values, causing some or all of the EMMs to be non-estimable (but see an exception below). Fitting a different model where one or more of the interactions are excluded may lead to the grid values being estimable, and hence the EMMs being estimable.
The question of whether observations are missing at random, not at random, completely at random, etc. is a modeling issue (or, per some comments, whether you trust the model you used). If the model is [in]appropriate or [un]trustworthy, the resulting EMMs will also be [in]appropriate or [un]trustworthy. Some missingness assumptions allow for multiple imputation techniques, and those may (or may not) allow for grid means to be estimable, and will impact the EMMs accordingly.
Alternative weighting schemes (such as weighting proportionally to marginal frequencies) obviously affect the EMMs as well. A weighting scheme that gives zero weight to any grid combination that is non-estimable will provide estimable EMMs where otherwise they would be non-estimable. In particular, in an (all-factors, all-interactions, least-squares) situation, weighting according to cell frequencies will yield EMMs equal to OMMs.
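To make the last two points concrete, here is a sketch in plain Python using made-up data with one empty cell. As before, cell means stand in for the predictions of an all-factors, all-interactions least-squares fit; the `emm` and `omm` helpers and the rule of returning `None` for a non-estimable result are assumptions made for illustration only.

```python
# Hypothetical data: (treat, sex) -> observed responses; ("B", "F") is an empty cell.
data = {
    ("A", "F"): [10.0, 12.0],
    ("A", "M"): [20.0],
    ("B", "F"): [],
    ("B", "M"): [22.0, 24.0, 26.0],
}

def emm(treat, weight):
    """Weighted average of cell means for one treatment level.

    weight maps a cell count to that cell's weight. Returns None
    (non-estimable) if a cell with nonzero weight has no data.
    """
    num = den = 0.0
    for (t, _), ys in data.items():
        if t != treat:
            continue
        w = weight(len(ys))
        if w == 0:
            continue              # zero-weight cells drop out of the average
        if not ys:
            return None           # needs a prediction that cannot be estimated
        num += w * (sum(ys) / len(ys))
        den += w
    return num / den if den > 0 else None

def omm(treat):
    """Ordinary marginal mean: average of all raw observations at this level."""
    ys = [y for (t, _), cell in data.items() if t == treat for y in cell]
    return sum(ys) / len(ys)

equal = lambda n: 1.0         # equal weights
by_freq = lambda n: float(n)  # weights proportional to cell frequencies

print(emm("A", equal), emm("A", by_freq), omm("A"))  # 15.5 14.0 14.0
print(emm("B", equal), emm("B", by_freq), omm("B"))  # None 24.0 24.0
```

Under equal weights the EMM for level B is non-estimable because of the empty cell, whereas frequency weights give that cell zero weight and reproduce the OMM.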