I have two different weather forecasting systems. Each system returns values between 0 and 30 degrees. In addition, I have a ground-truth set containing the real temperature values. Now I want to find the optimal mix of both systems via an optimization (i.e. minimize):
$$ \sum_{d=1}^D \big|t_d-(w_{S1}\widehat{t_{d,S1}}+w_{S2}\widehat{t_{d,S2}})\big| \to \min $$
(where $d$ indexes days, $t$ denotes actual or predicted temperatures, and $S1, S2$ are my two forecasting systems), under the constraint that
$$ w_{S1}+w_{S2} = 1. $$
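With the constraint $w_{S1}+w_{S2}=1$, the problem reduces to a single free weight, so it can be solved numerically by substituting $w_{S2}=1-w_{S1}$. A minimal sketch with SciPy, using synthetic data in place of real forecasts (the data, the `[0, 1]` bound on the weight, and all variable names are my own assumptions, not part of the question):

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Synthetic stand-ins: ground truth and the two systems' forecasts.
rng = np.random.default_rng(0)
t = rng.uniform(0, 30, size=100)          # actual temperatures t_d
t_s1 = t + rng.normal(0, 2, size=100)     # forecasts from system S1
t_s2 = t + rng.normal(0, 3, size=100)     # forecasts from system S2

# Substituting w_S2 = 1 - w_S1 turns the constrained problem into a
# one-dimensional minimization of the sum of absolute errors.
def abs_error(w1):
    return np.sum(np.abs(t - (w1 * t_s1 + (1 - w1) * t_s2)))

res = minimize_scalar(abs_error, bounds=(0, 1), method="bounded")
w1 = res.x
print(f"w_S1 = {w1:.3f}, w_S2 = {1 - w1:.3f}")
```

Note that restricting $w_{S1}$ to $[0,1]$ is an extra assumption beyond the stated sum-to-one constraint; dropping the bounds would allow negative weights.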
Unfortunately, this tends to overfit (even with cross-validation), and the final predictions are not as good as expected.
Therefore, I thought about adding a regularization term, for example
$$ \dots + \lambda\big(w_{S1}^2+w_{S2}^2\big) $$
to my equation.
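Since $w_{S1}^2+w_{S2}^2$ is minimized at $w_{S1}=w_{S2}=0.5$ under the sum-to-one constraint, increasing $\lambda$ should indeed pull the solution toward an even mix. A small sketch illustrating that behavior, again with synthetic data and an assumed `[0, 1]` bound on the weight (none of these specifics come from the question):

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Synthetic stand-ins for ground truth and the two forecast series.
rng = np.random.default_rng(1)
t = rng.uniform(0, 30, size=100)
t_s1 = t + rng.normal(0, 2, size=100)
t_s2 = t + rng.normal(0, 3, size=100)

def objective(w1, lam):
    # Absolute-error fit term plus the L2 penalty on both weights,
    # with w_S2 = 1 - w_S1 substituted in from the constraint.
    fit = np.sum(np.abs(t - (w1 * t_s1 + (1 - w1) * t_s2)))
    penalty = lam * (w1 ** 2 + (1 - w1) ** 2)
    return fit + penalty

# As lambda grows, the penalty dominates and the optimal w_S1 moves
# toward 0.5, because w1^2 + (1 - w1)^2 is minimized at w1 = 0.5.
for lam in (0.0, 10.0, 1e6):
    res = minimize_scalar(objective, args=(lam,), bounds=(0, 1),
                          method="bounded")
    print(f"lambda = {lam:9.1f} -> w_S1 = {res.x:.3f}")
```

In practice $\lambda$ would be chosen by cross-validation, just like any other hyperparameter.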
The rationale is that the penalty pushes both weights toward a 50/50 mix, and I suspect such a balanced combination would perform better on real-world data than a solution where one system gets a much larger weight than the other (i.e. where the gap between the two weights is large).
Does this make sense? Is this regularization suitable? Are there other options?