Suppose you have a number of input features, for example:
- x1 - temperature
- x2 - day of the week
- x3 - quantity of rainfall
- ...
Using neural networks, you are trying to predict a number of output targets, for example:
- y1: ice cream consumption
- y2: sun cream sales
- y3: umbrella sales
Now, you could build a separate regression model for each target variable, e.g.:
- Model 1: x1, x2, x3, ..., xn -> y1
- Model 2: x1, x2, x3, ..., xn -> y2
However, if similar features are useful for each model, it could save resources to build a single model that predicts all outputs simultaneously:
- Model: x1, x2, x3, ..., xn -> y1, y2, ..., ym
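To make the resource argument concrete, here is a minimal sketch that counts trainable parameters for the two set-ups. It assumes one fully connected hidden layer of the same width in both cases; the layer sizes and function names are hypothetical and chosen only for illustration.

```python
def mlp_param_count(n_inputs, n_hidden, n_outputs):
    """Weights plus biases of a fully connected net with one hidden layer."""
    hidden_layer = n_inputs * n_hidden + n_hidden
    output_layer = n_hidden * n_outputs + n_outputs
    return hidden_layer + output_layer


def separate_models_param_count(n_inputs, n_hidden, n_targets):
    """One single-output network per target variable."""
    return n_targets * mlp_param_count(n_inputs, n_hidden, 1)


def shared_model_param_count(n_inputs, n_hidden, n_targets):
    """One network whose hidden layer is shared by all targets."""
    return mlp_param_count(n_inputs, n_hidden, n_targets)


# Hypothetical sizes (assumption): 10 features, 32 hidden units, 3 targets.
separate = separate_models_param_count(10, 32, 3)
shared = shared_model_param_count(10, 32, 3)
print(separate, shared)  # prints "1155 451"
```

The comparison only holds if the shared model can get away with a hidden layer of similar width; if it has to be made much wider to serve all targets, the saving shrinks.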
My question (in two parts) is as follows:
- What is the downside of doing this? Is there any literature covering this?
- Does it make more sense to build a multi-output model when the target variables are correlated, e.g. ice cream consumption and sun cream sales?
My experience is that there is a trade-off between the error on each target variable and the resources required (e.g. training time, maintaining multiple models, etc.). As you add more output targets, the error on each target (on the validation set, say) increases, but the number of models, training time, etc. decrease.
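One way to make this trade-off measurable is to report the validation error per target rather than a single averaged loss. A minimal sketch, assuming predictions and true values are stored as lists of rows with one column per target (the function name and the toy numbers are made up for illustration):

```python
def per_target_mse(y_true, y_pred):
    """Mean squared error computed separately for each output column."""
    n_rows = len(y_true)
    n_targets = len(y_true[0])
    errors = []
    for j in range(n_targets):
        squared = [(y_true[i][j] - y_pred[i][j]) ** 2 for i in range(n_rows)]
        errors.append(sum(squared) / n_rows)
    return errors


# Toy validation set (assumption): 2 rows, 2 targets.
y_val = [[1.0, 2.0], [3.0, 4.0]]
y_hat = [[1.0, 3.0], [3.0, 2.0]]
errors = per_target_mse(y_val, y_hat)
print(errors)  # prints "[0.0, 2.5]"
```

Comparing these per-target numbers between the single multi-output model and the separate models shows which targets, if any, pay for the shared model.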
Any insight would be greatly appreciated.