Assume we had a set of data that contained thousands of samples with the following information: gender, age, height, weight, country.
Now, suppose we wanted to build a model for predicting people's heights based on gender, age, weight, and country.
It is clear that, in general, the mean female height will be a few inches shorter than the mean male height. Is there any benefit to splitting the data by gender and building two separate predictive models (one for men, one for women) in this situation?
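To make the two options concrete for the simplest case, here is a minimal sketch using ordinary least squares on synthetic data. Everything in it is an assumption made for illustration: the made-up coefficients, the use of weight as the only other predictor, and the helper names `solve`, `ols`, and `predict`. It fits two separate per-gender lines and one pooled model that includes a gender dummy and a gender-by-weight interaction, then checks how far apart their fitted values are.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def ols(X, y):
    """Least-squares coefficients via the normal equations X'X beta = X'y."""
    p = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    return solve(XtX, Xty)

def predict(beta, row):
    return sum(b * v for b, v in zip(beta, row))

# Synthetic data: all numbers here are invented, not estimates from real people.
random.seed(0)
data = []
for _ in range(400):
    male = random.randint(0, 1)
    weight = random.gauss(62 + 14 * male, 10)                 # kg
    height = 150 + 12 * male + 0.25 * weight + random.gauss(0, 5)  # cm
    data.append((male, weight, height))

# Option 1: split by gender and fit height ~ 1 + weight in each group.
split = {}
for g in (0, 1):
    rows = [(w, h) for m, w, h in data if m == g]
    split[g] = ols([[1.0, w] for w, _ in rows], [h for _, h in rows])

# Option 2: one model on all rows, height ~ 1 + male + weight + male*weight.
X = [[1.0, m, w, m * w] for m, w, _ in data]
y = [h for _, _, h in data]
joint = ols(X, y)

# Largest disagreement between the two sets of fitted values.
max_gap = max(
    abs(predict(split[m], [1.0, w]) - predict(joint, [1.0, m, w, m * w]))
    for m, w, _ in data
)
print("largest difference in fitted heights:", max_gap)
```

In this linear setting the two approaches give the same fitted values up to rounding, because the fully interacted single model has one free intercept and slope per gender. The sketch says nothing about other model families, or about a single model that omits the interaction terms.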
In terms of age, we know that, roughly speaking, height increases from birth until about age 20, then stays stable until, say, around 60 years of age, at which point it slowly decreases.
So we could split the data into non-overlapping age ranges such as 0-9, 10-19, 20-29, etc., and create a separate predictive model for each range. Is there any benefit to doing this? Or would it actually be disadvantageous?
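As a rough illustration of what the age split does to the data, the sketch below assigns each age to a ten-year bin and counts how many samples each per-bin model would be trained on. The helper `age_bin`, the sample size of 5000, and the uniform age distribution are all assumptions for the example; real data would be spread unevenly across ages.

```python
import random
from collections import Counter

def age_bin(age, width=10):
    """Return the half-open bin [lo, lo + width) that contains age."""
    lo = int(age // width) * width
    return (lo, lo + width)

# Hypothetical ages, drawn uniformly purely for illustration.
random.seed(1)
ages = [random.uniform(0, 90) for _ in range(5000)]

# Number of samples available to each per-bin model.
counts = Counter(age_bin(a) for a in ages)
for b in sorted(counts):
    print(b, counts[b])
```

Each per-bin model sees only its own share of the rows, whereas a single model is fitted on all of them. Half-open bins also ensure that a boundary age such as 10 belongs to exactly one bin.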
More generally, I am asking whether we should split the data and build separate models when some predictors follow well-known, specific patterns. Or will predictive performance be better if we build a single model that uses all of the data?