One predictive model for all data vs subject/group specific models

Question

I have noticed that in some (rare) situations, training separate models per subject, group, or subpopulation is preferred to training one general predictive model on all the data (presumably for accuracy?).

For example, with medical data, I have seen a predictive model trained for each patient separately (instead of one model for all patients). In other fields, I have seen separate predictive models trained for each geographic region.

Under what circumstances are subject/group-specific models preferred to one general model?
Why is one general model with a group variable as a feature not enough?
What are the advantages of subject/group specific models over one single model?

score 1 · Answer 1 · answered Nov 12 '16 at 13:44

My guess is that researchers who are unfamiliar or uncomfortable with hierarchical/mixed-effects models might break their data up and create separate models. I don't see any advantage to broken-up models unless your separate groups have very different covariances and your modeling technique adjusts to this for you.

Hierarchical/mixed-effects models allow for sharing of strength among groups, which means that smaller groups will be pulled more towards the overall mean, while larger groups -- those with more information -- are more independent. The amount of pooling is determined by the data.
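To make the pooling concrete, here is a minimal sketch of the shrinkage that a random-intercept model applies to group means. It is not a full mixed-effects fit: the function name `partial_pool` and the toy data are hypothetical, the within-group variance `sigma2` and between-group variance `tau2` are assumed known (a real hierarchical model estimates them from the data), and the plain mean of all observations stands in for the estimated population mean.

```python
from statistics import mean


def partial_pool(groups, sigma2, tau2):
    """Shrink each group's mean toward the overall mean.

    groups: dict mapping group name -> list of observations
    sigma2: within-group variance (ASSUMED known for this sketch)
    tau2:   between-group variance (ASSUMED known for this sketch)
    """
    # Simplification: use the mean of all observations as the overall mean.
    overall = mean(y for ys in groups.values() for y in ys)
    estimates = {}
    for name, ys in groups.items():
        n = len(ys)
        # Weight on the group's own mean: approaches 1 as n grows,
        # approaches 0 for tiny groups.
        weight = n / (n + sigma2 / tau2)
        estimates[name] = weight * mean(ys) + (1 - weight) * overall
    return estimates


# Toy data: a small group with a high mean and a large group with a low mean.
data = {
    "small": [10.0, 14.0],
    "large": [4.0] * 10 + [6.0] * 10,
}
estimates = partial_pool(data, sigma2=4.0, tau2=1.0)
for name, value in estimates.items():
    print(name, "raw mean:", mean(data[name]), "pooled estimate:", round(value, 3))
```

With these assumed variances, the two-observation group keeps only a third of the weight on its own mean, while the twenty-observation group keeps five sixths, so the small group moves much further toward the overall mean. Fitting separate models corresponds to a weight of 1 for every group, and one model that ignores groups corresponds to a weight of 0.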

You do need a fair number of groups to estimate the distribution of group coefficients reasonably sharply: one rule of thumb is at least 5 groups, though I've also seen at least 30 suggested. That said, I believe I've read Andrew Gelman saying that a hierarchical model won't do worse than separate models.
