
So if I have two GLMs, model_1 and model_2, I know they're nested if every parameter in model_1 also appears in model_2 (or vice versa).

This means I can then use chi-squared/F tests (depending on the choice of error structure in the GLM) to assess whether the two models are significantly different (and favour the simpler model of the two if they aren't). I'm clear on that point.
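For reference, a sketch of the two tests in standard notation. The symbols are labels introduced here for illustration: $D_0$ and $p_0$ are the deviance and number of parameters of the smaller model, $D_1$ and $p_1$ those of the larger model, $n$ is the number of observations, and $\hat\phi$ is the dispersion estimated from the larger model. When the dispersion is known (e.g. Poisson or binomial errors, where $\phi = 1$), the test is approximately

$$D_0 - D_1 \;\sim\; \chi^2_{\,p_1 - p_0},$$

and when the dispersion has to be estimated (e.g. gamma or Gaussian errors), approximately

$$F = \frac{(D_0 - D_1)/(p_1 - p_0)}{\hat\phi} \;\sim\; F_{\,p_1 - p_0,\; n - p_1}.$$

Both results are asymptotic and rely on the smaller model being nested in the larger one.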

For example, in the context of motor insurance (my favourite), if I fit two models:

model_1 = intercept + age_factor + vehicle_group_factor
model_2 = intercept + age_factor

Then these two are nested since everything in model 2 also appears in model 1.

Are two models still nested if the second model is a simplification of the first - for example, a banded factor derived from the original factor? So if I define a factor vehicle_group_factor_banded which is a banded version of the original ("small"/"midsize"/"large"), for example:

vehicle_group_factor    vehicle_group_factor_banded
1                       small
2                       small
3                       small
4                       small
5                       small
6                       midsize
7                       midsize
8                       midsize
9                       midsize
10                      midsize
11                      large
12                      large
13                      large
14                      large
15                      large

... and then fit:

model_1 = intercept + age_factor + vehicle_group_factor
model_2 = intercept + age_factor + vehicle_group_factor_banded
model_3 = intercept + age_factor

then I know I can say that both (1, 3) and (2, 3) are nested. However, are (1, 2) considered to be nested?

kjetil b halvorsen
Alan
  • Thank you. Yes the small/midsize/large would be a subset of the vehicle group. So it might be (for example) that vehicle groups 1-5 are mapped to "small", vehicle groups 6-10 are mapped to "midsize" and vehicle groups 11-15 are mapped to "large". In that case I think you're saying that I can deem these two models to be nested, yes? – Alan Aug 22 '20 at 12:29
  • Oh, two comments (one saying yes, one saying no) and they've both disappeared! – Alan Aug 22 '20 at 12:37
  • I was wrong and therefore deleted my answer. If you write out your model (one hot encoding the factors) you will see that model 2 is not a subset of model 1. – J.C.Wahl Aug 22 '20 at 12:40
  • Ah ok so if I write it using a bunch of dummy factors, such as “+ vehicle_group_factor_1 + vehicle_group_factor_2 ... vehicle_group_factor_15” – Alan Aug 22 '20 at 12:46
  • Yes, that is correct:) And this is actually what the model looks like, you get one coefficient for each level in the factor. – J.C.Wahl Aug 22 '20 at 13:24
  • Yes, any model with the banded (simplified) factor is nested relative to (a simplified version of) the same model with the full factor. This will be true no matter how the dummy variables are defined. The two models can use completely non-overlapping dummy variables and the banded factor model will still be nested relative to the full model. – Gordon Smyth Aug 23 '20 at 01:09

1 Answer


Yes, the model with a banded variable is still nested. This is because any fit with the banded variable can be matched by the larger model, by suitably restricting some of its parameters (setting them equal to each other). Full details here: Nested GLM Models
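The "setting equal" argument can be checked mechanically. Below is a minimal sketch in plain Python, assuming the banding from the question (groups 1-5 "small", 6-10 "midsize", 11-15 "large"). The helper names and the coefficient values are made up for illustration and are not fitted estimates. The intercept and age_factor are common to both models, so only the vehicle terms are shown. It confirms that every banded dummy column is a sum of full-factor dummy columns, and that a full model whose coefficients are equal within each band gives exactly the banded model's linear predictor.

```python
# Banding assumed from the table in the question.
BANDS = ["small", "midsize", "large"]
BAND_OF = {g: ("small" if g <= 5 else "midsize" if g <= 10 else "large")
           for g in range(1, 16)}

def full_dummies(group):
    """One-hot row for the 15-level vehicle_group_factor."""
    return [1 if group == g else 0 for g in range(1, 16)]

def banded_dummies(group):
    """One-hot row for the 3-level vehicle_group_factor_banded."""
    return [1 if BAND_OF[group] == b else 0 for b in BANDS]

# Constraint matrix C (15 x 3): row g marks the band that group g falls in.
C = [banded_dummies(g) for g in range(1, 16)]

def row_times_matrix(row, mat):
    return [sum(row[i] * mat[i][j] for i in range(len(row)))
            for j in range(len(mat[0]))]

# 1) X_banded = X_full @ C, so the banded columns lie in the span of the full ones.
columns_match = all(
    row_times_matrix(full_dummies(g), C) == banded_dummies(g)
    for g in range(1, 16)
)

# 2) Any banded fit is a full fit with coefficients set equal within each band.
banded_coefs = [0.10, 0.25, 0.40]  # arbitrary illustrative values
full_coefs = [banded_coefs[BANDS.index(BAND_OF[g])] for g in range(1, 16)]

def linear_predictor(row, coefs):
    return sum(x * c for x, c in zip(row, coefs))

same_predictor = all(
    linear_predictor(full_dummies(g), full_coefs)
    == linear_predictor(banded_dummies(g), banded_coefs)
    for g in range(1, 16)
)

# Number of equality constraints = degrees of freedom for the nested test.
df_difference = len(full_coefs) - len(banded_coefs)

print(columns_match, same_predictor, df_difference)
```

So a comparison of model_1 against model_2 in the question would be a test on 12 degrees of freedom: 15 vehicle groups collapsed to 3 bands.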

kjetil b halvorsen