
I have a regression,

$y = \beta_0 + \beta_1 x_1 + \beta_2 x_1^2 + \beta_3 x_1 R_1 + \beta_4 x_1^2 R_1 + \beta_5 x_1 R_2 + \beta_6 x_1^2 R_2 + \cdots + u$

Where $R_1$ and $R_2$ are indicators for regions 1 and 2 ($R_3$ omitted). It is easy to test $\beta_3 = \beta_4 = 0$, but when I omit the second and third terms ($\beta_1 x_1$ and $\beta_2 x_1^2$), they get added back to the regression when I run lm(). I want to test $\beta_1 = \beta_2 = 0$ but can't.

jsdzn001
  • Some parts of this question are unclear to me. It is unclear which terms you are referring to (second, third). (Do they relate to the terms with $\beta_3$ and $\beta_4$?) Also, could you provide your R formula, code and output? (Maybe this is actually more of a coding question, which is off-topic here, but it is difficult to say when the question is not so clear.) – Sextus Empiricus Sep 13 '20 at 21:02
  • How are you entering the terms involving a square? – Glen_b Sep 14 '20 at 04:29
  • `R` offers some nice capabilities for modifying models for this purpose. Here's an example of testing $\beta_1=\beta_2=0$: `X` … – whuber Sep 14 '20 at 14:48

1 Answer


With the interaction terms, the test of $\beta_1=\beta_2=0$ would be misleading in any event.

Yes, when you specify an interaction term of the form `x1*R1`, R automatically expands it to include main-effect terms for both `x1` and `R1` individually. It is seldom a good idea to include an interaction term without the corresponding individual predictors. See the discussion on this page.
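A minimal sketch of this expansion, using simulated data with made-up variable names:

```r
# R's formula operator `*` expands a*b to a + b + a:b, so x1*region
# brings the main effects back in even if you try to leave them out.
set.seed(1)
d <- data.frame(x1 = rnorm(100),
                region = factor(sample(c("R1", "R2", "R3"), 100, replace = TRUE)))
d$y <- 1 + 0.5 * d$x1 + rnorm(100)

fit <- lm(y ~ x1 * region + I(x1^2) * region, data = d)
names(coef(fit))  # includes x1, I(x1^2), and region main effects
                  # alongside the interaction terms
```

To get only the interactions you would have to use `:` instead of `*` (e.g. `x1:region`), which is exactly the maneuver the answer argues against.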

Even if you did use a syntactic trick to convince R to do what you want, the interpretation of $\beta_1$ and $\beta_2$ would depend on how all of the variables interacting with $x_1$ and $x_1^2$ are coded. With the interaction terms included in the model, $\beta_1$, for example, would be the rate of change of the outcome with respect to $x_1$ when all of its interacting continuous predictors have values of 0 and all of its interacting categorical predictors are at their reference levels. If the interactions are significant, then the value of $\beta_1$ would change if, say, sex were one of the interacting predictors and you switched male for female as the reference, or if you decided to center an interacting continuous predictor. See this page among many others on this site.

So a test for $\beta_1=\beta_2=0$ doesn't tell you much directly about $x_1$ and $x_1^2$ individually if they are involved in interactions. You usually want to test whether a predictor along with some or all of its interactions is important to a model.
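One standard way to do that joint test in R is to compare nested models with `anova()`: fit the full model, fit a reduced model that drops the predictor and everything involving it, and let the F-test judge all of those coefficients at once. A sketch with simulated data and hypothetical names:

```r
# Joint F-test for x1, x1^2, and all their region interactions.
set.seed(3)
d <- data.frame(x1 = rnorm(200),
                region = factor(sample(c("R1", "R2", "R3"), 200, replace = TRUE)))
d$y <- 1 + 0.5 * d$x1 + rnorm(200)

full    <- lm(y ~ x1 * region + I(x1^2) * region, data = d)
reduced <- lm(y ~ region, data = d)
anova(reduced, full)  # tests all six x1-related coefficients jointly
```

The same test can be written with `car::linearHypothesis()` if you prefer to name the coefficients explicitly.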

EdM