Effect of combining predictor variables in a regression model

Question

Let's say I first run a linear regression model Sales = f(TV Spend, Digital Spend).

Now I sum TV Spend and Digital Spend into a single predictor and run a second model: Sales = f(TV Spend + Digital Spend).

How is the beta coefficient of the second model related to the beta coefficients of the first model? Also, does the first model always explain more variance than the second model?

I have no idea why people downvoted this without leaving an explanatory comment. Perhaps you may gain some insights from this Q&A http://stats.stackexchange.com/questions/86269/what-is-the-effect-of-having-correlated-predictors-in-a-multiple-regression-mode?rq=1 although it is not a direct answer to your question. — mdewey, Jan 14 '17 at 11:09
Thanks @mdewey. This is related to Simpson's paradox and multicollinearity, as in your link. I just don't understand how. Here, I am not dropping one of the variables, but combining them. — Sharath G, Jan 14 '17 at 21:10

Pere · Answer 1 · 2017-12-23T19:50:09.153

The first model will always explain at least as much variance as the second one (in-sample), since it has one more predictor and the second model is a special case of it.

On the other hand, the difference may not be significant. Since choosing the second model amounts to constraining two parameters of the first model to be equal, you can run an F-test with the null hypothesis that those two parameters are equal and decide whether it's worth keeping them as separate parameters. Other model selection criteria can work, too (adjusted R², AIC, Mallows's Cp...).
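For reference, the F-test for a single equality restriction is usually written in terms of the residual sums of squares of the two fits. The symbols below are my own notation, not something from the question: $RSS_1$ and $RSS_2$ are the residual sums of squares of the first and second model, and $n$ is the number of observations.

$$F=\frac{(RSS_2-RSS_1)/1}{RSS_1/(n-3)}$$

The 1 in the numerator is the number of restrictions and the 3 in the denominator is the number of parameters in the first model. Under the null hypothesis, and assuming the usual linear model assumptions (independent normal errors with constant variance), this statistic follows an F distribution with 1 and $n-3$ degrees of freedom.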

Detailed explanation of the second paragraph as asked in comment

Your first model is:

$$sales=\beta_0+\beta_1 \cdot tvspend+\beta_2 \cdot digspend$$

Your second model is:

$$sales=\beta_0+\beta_1 \cdot (tvspend+digspend)$$

And that one is equivalent to:

$$sales=\beta_0+\beta_1 \cdot tvspend+\beta_1 \cdot digspend$$

You can see that the difference between your two models is that in the first one each predictor has its own parameter, while in the second one both predictors share the same parameter. Claiming that the two models are not significantly different is the same as claiming that $\beta_1=\beta_2$ in the first model, and that claim can be tested with an F-test.
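A minimal sketch of this comparison in Python, assuming NumPy is available. The data are synthetic and made up purely for illustration, and the names (`ols_rss`, `tv`, `dig`, `w_tv`) are hypothetical. It fits both models, compares their residual sums of squares, and computes the F statistic for $\beta_1=\beta_2$. It also checks numerically a standard least-squares identity: the slope of the second model equals a weighted combination of the two slopes of the first model, with weights that sum to 1 and depend only on the sample variances and covariance of the predictors.

```python
import numpy as np


def ols_rss(X, y):
    """Fit ordinary least squares; return (coefficients, residual sum of squares)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return coef, float(resid @ resid)


# Synthetic data (an assumption for illustration, not from the question)
rng = np.random.default_rng(0)
n = 200
tv = rng.uniform(0, 100, n)
dig = rng.uniform(0, 100, n)
sales = 10 + 0.5 * tv + 0.2 * dig + rng.normal(0, 5, n)

ones = np.ones(n)
X1 = np.column_stack([ones, tv, dig])    # first model: separate slopes
X2 = np.column_stack([ones, tv + dig])   # second model: one shared slope

coef1, rss1 = ols_rss(X1, sales)
coef2, rss2 = ols_rss(X2, sales)

# F statistic for H0: beta_1 = beta_2 (1 restriction, n - 3 residual d.o.f.)
f_stat = (rss2 - rss1) / (rss1 / (n - 3))

# Weighted-combination identity for the shared slope, using centered sums
tc = tv - tv.mean()
dc = dig - dig.mean()
s_tt, s_dd, s_td = tc @ tc, dc @ dc, tc @ dc
w_tv = (s_tt + s_td) / (s_tt + s_dd + 2 * s_td)
pooled_from_first = w_tv * coef1[1] + (1 - w_tv) * coef1[2]

print("first model slopes:", coef1[1], coef1[2])
print("second model slope:", coef2[1], "reconstructed:", pooled_from_first)
print("RSS first:", rss1, "RSS second:", rss2)
print("F statistic:", f_stat)
```

The F statistic would then be compared against an F distribution with 1 and n - 3 degrees of freedom. Note that the weight `w_tv` can fall outside [0, 1] when the predictors are strongly negatively correlated, so the shared slope need not lie between the two separate slopes.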

Thanks @Pere. Not sure I understand "since choosing the second model is just making two parameters in the first one equal". Are you trying to explain the relationship between the parameters in the first model and the second? — Sharath G, Jan 22 '17 at 00:06
