I have a logistic model in which I modeled an independent variable first as a first-order term only and then as a second-order polynomial (via the `poly` function in R). I noticed that the p-value for the IV decreased dramatically, from 10^-4 to 10^-77. However, when I look at the summary output for the first- and second-order terms, the first-order term appears not to be statistically significant (high Pr(>|z|)).

                      Pr(>|z|)
    poly(log(x), 2)1  0.420098
    poly(log(x), 2)2  0.000322 ***

Does this mean that the first-order term should be removed from the model, and if so, what is the syntax to do so?

Frank H.

1 Answer

Lack of statistical significance is never a sufficient reason to remove a predictor from a regression model. So the answer to your question is simply, no.

Does it matter that I would leave in the second-order term, rather than removing the entire variable (both first- and second-order terms)?

If you remove the first order term from the model, you are telling the model that you are 100% confident that the rate of change of y with respect to log(x) is zero at the origin. Are you 100% confident of this?
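
To make the point above concrete, here is a short worked derivation. It assumes a raw (non-orthogonal) quadratic in u = log(x) and reads "y" as the linear predictor (log-odds) of the logistic model; the symbols η, u, and β are introduced here for illustration and do not appear in the original post.

$$\eta(u) = \beta_0 + \beta_1 u + \beta_2 u^2, \qquad u = \log(x)$$

$$\frac{d\eta}{du} = \beta_1 + 2\beta_2 u, \qquad \left.\frac{d\eta}{du}\right|_{u=0} = \beta_1$$

Dropping the first-order term fixes β1 = 0, which forces the fitted curve to have its turning point exactly at log(x) = 0, that is, at x = 1. Note that `poly()` uses orthogonal polynomials by default (`raw = FALSE`), so the coefficients in the summary table refer to an orthogonalized basis; the slope-at-origin reading applies directly to the raw parametrization, e.g. `poly(log(x), 2, raw = TRUE)`.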

Matthew Drury
  • Does it matter that I would leave in the second-order term, rather than removing the entire variable (both first- and second-order terms)? – Frank H. Jun 07 '17 at 17:43
  • If you remove the first order term from the model, you are telling the model that you are 100% confident that the rate of change of `y` with respect to `log(x)` is zero at the origin. Are you 100% confident of this? – Matthew Drury Jun 07 '17 at 17:44
  • Why not edit your comment into the answer, Matthew, as it goes to the heart of the matter? – mdewey Jun 07 '17 at 17:46
  • @mdewey It is done! – Matthew Drury Jun 07 '17 at 17:47
  • Removing the first-order term changes the meaning of the second-order term. It no longer reflects what you think it reflects. – dbwilson Jun 07 '17 at 17:49