So if I have two GLMs, model_1 and model_2, I know they're nested if every parameter in model_1 also appears in model_2 (or vice versa).
This means I can then use chi-squared/F tests (depending on the choice of error structure in the GLM) to assess whether the two models are significantly different (and favour the simpler model of the two if they aren't). I'm clear on that point.
For example, in the context of motor insurance (my favourite), if I fit two models:
model_1 = intercept + age_factor + vehicle_group_factor
model_2 = intercept + age_factor
Then these two are nested, since every term in model_2 also appears in model_1.
Are two models still nested if the second model is a simplification of the first - for example, a banded factor derived from the original factor? So if I define a factor vehicle_group_factor_banded which is a banded version of the original ("small"/"midsize"/"large"), for example:
vehicle_group_factor    vehicle_group_factor_banded
1                       small
2                       small
3                       small
4                       small
5                       small
6                       midsize
7                       midsize
8                       midsize
9                       midsize
10                      midsize
11                      large
12                      large
13                      large
14                      large
15                      large
... and then fit:
model_1 = intercept + age_factor + vehicle_group_factor
model_2 = intercept + age_factor + vehicle_group_factor_banded
model_3 = intercept + age_factor
then I know that both (1, 3) and (2, 3) are nested pairs. However, are (1, 2) also considered nested?
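
To make the (1, 2) comparison concrete, here is a rough sketch of the check I have in mind: under plain one-hot coding, is each banded indicator column just the sum of the original indicator columns for the vehicle groups in that band? The banding is taken from the table above; the helper names (one_hot, banded_columns_are_sums) and the sample of vehicle groups are hypothetical and only there for illustration.

```python
# Banding from the table above: groups 1-5 small, 6-10 midsize, 11-15 large.
BANDS = {g: ("small" if g <= 5 else "midsize" if g <= 10 else "large")
         for g in range(1, 16)}


def one_hot(values, levels):
    """Return one indicator column (a list of 0/1) per level."""
    return {lvl: [1 if v == lvl else 0 for v in values] for lvl in levels}


def banded_columns_are_sums(groups):
    """Check that each banded indicator equals the sum of the original
    indicators for the vehicle groups that fall in that band."""
    banded = [BANDS[g] for g in groups]
    orig_cols = one_hot(groups, sorted(BANDS))
    band_cols = one_hot(banded, ["small", "midsize", "large"])
    for band, col in band_cols.items():
        members = [g for g, b in BANDS.items() if b == band]
        summed = [sum(orig_cols[g][i] for g in members)
                  for i in range(len(groups))]
        if summed != col:
            return False
    return True


# Hypothetical sample of policies, identified only by vehicle group.
sample_groups = [1, 4, 6, 6, 10, 11, 15, 3, 9, 12]
print(banded_columns_are_sums(sample_groups))
```

If the banded columns are always sums of the original ones, then any fit from model_2 could be reproduced by model_1 by giving every vehicle group within a band the same coefficient, which is the sense of "nested" I am asking about.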