I am trying to make sense of what a saturated model is. AFAIK it's when you have as many features as observations.
Can we say a saturated model is a special case of an extremely overfitted model?
@Tomka's right. A saturated model fits as many parameters as possible for a given set of predictors, but whether it's over-fitted or not depends on the number of observations for each unique pattern of predictors. Suppose you have a linear model with 100 observations of $y$ on $x=0$ and 100 on $x=1$. Then the model $\operatorname{E}Y = \beta_0 +\beta_1 x$ is saturated but surely not over-fitted. But if you have one observation of $y$ for each of $x=(0,1,2,3,4)^\mathrm{T}$ the model $\operatorname{E}Y = \beta_0 +\beta_1 x +\beta_2 x^2 +\beta_3 x^3 +\beta_4 x^4$ is saturated & a perfect fit—doubtless over-fitted†.
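To make the second case concrete, here's a small sketch (my own illustration, not from any particular textbook) showing that a degree-4 polynomial fitted to one observation at each of $x=0,\dots,4$ interpolates the data exactly — five parameters for five observations, so the saturated model leaves zero residual:

```python
import numpy as np

# One noisy observation of y at each of x = (0, 1, 2, 3, 4)
x = np.arange(5)
rng = np.random.default_rng(0)
y = 1.0 + 0.5 * x + rng.normal(size=5)

# Degree-4 polynomial: 5 coefficients for 5 observations — saturated
coefs = np.polyfit(x, y, deg=4)
fitted = np.polyval(coefs, x)

# The fit is "perfect": fitted values reproduce y up to numerical error
print(np.allclose(fitted, y))  # True
```

By contrast, in the first scenario (100 observations at each of $x=0$ and $x=1$) the saturated model just estimates the two group means, so its two parameters are each supported by 100 observations and nothing is interpolated.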
When people talk about saturated models having as many parameters as observations, as in the linked web page & CV post, they're assuming a context of one observation for each predictor pattern. (Or perhaps sometimes using 'observation' differently—are 100 individuals in a 2×2 contingency table 100 observations of individuals, or 4 observations of cell frequencies?)
† Don't take "surely" & "doubtless" literally, by the way. It's possible for the first model that $\beta_1$ is so small compared to $\operatorname{Var}Y$ you'd predict better without trying to estimate it, & vice versa for the second.