Non-significant factors after stepwise regression

Question

I have run a stepwise regression in R. However, the summary of the final model includes some factors that are not significant. Why have these factors not been removed? Should I remove them from my model? The VIFs of these factors are all under 5.

I just used the code `summary(step(model))`, where `model` is the name of the model used. — Blair Outhwaite, Jul 29 '14 at 03:59
What made you use stepwise regression? Do you know how to run simulation studies that demonstrate how poorly these methods perform? — Frank Harrell, Jul 29 '14 at 04:21
Do you mean some *levels* of one or more factors are not significant in the output from `summary(model)`? Stepwise methods should rightly work on the amount of variance (expressed in one of a number of ways) explained by an entire term - i.e. over all levels of a factor. Some levels may not be significant but one or more levels will be. However, what you can infer from the $t$ stats and their p-values in that summary output is limited owing to multiple testing (one per $t$) and, *more importantly*, the inherent problems of stepwise procedures which render the $p$ values largely uninformative. — Gavin Simpson, Jul 29 '14 at 04:26
I bet if you used $\alpha = 0.1573$ they're all significant, though. How'd I do? [Save your applause, though](http://stats.stackexchange.com/questions/97257/stepwise-regression-in-r-critical-p-value/97309#97309), it's just a little algebra. — Glen_b, Jul 29 '14 at 06:18
Does [this](http://stats.stackexchange.com/questions/97257/stepwise-regression-in-r-critical-p-value/97309#97309) count as effectively a duplicate? — Glen_b, Jul 29 '14 at 06:23
From the output, if I am reading it correctly then it's all levels of the factor that are insignificant. I am using stepwise regression as this is what our lecturer wants us to use. — Blair Outhwaite, Jul 29 '14 at 22:09

score 0 · Answer 1 · answered Jul 29 '14 at 05:36

Blair,

The final model includes terms with p-values above the customary threshold because the function you used, `step`, does not select on p-values. It uses a different criterion, the AIC (Akaike information criterion). AIC is a summary evaluation of the entire model at each stage, and one model may have a smaller (i.e., better) AIC even though it contains terms with higher p-values.
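As a rough illustration of what `step` compares at each stage, the sketch below uses the built-in `mtcars` data and an arbitrary set of predictors, since the asker's model is not shown in the thread. The object names `full`, `final`, and `tab` are made up for this example. `drop1` lists the AIC that would result from removing each remaining term, next to the F-test p-value for that term, so the two criteria can be compared directly.

```r
# Illustrative only: built-in data and an arbitrary choice of predictors,
# not the model from the question.
full <- lm(mpg ~ wt + qsec + hp + drat, data = mtcars)

# Backward selection; trace = 0 silences the step-by-step printout.
# The default penalty k = 2 corresponds to AIC.
final <- step(full, trace = 0)

# For each term still in the model: the AIC if that term were dropped
# (the "<none>" row is the current model) and the F-test p-value.
tab <- drop1(final, test = "F")
print(tab)
```

In the printed table, every retained term has an AIC at least as large as the `<none>` row, which is why `step` stopped there, whatever the p-values in the last column happen to be. If a stricter penalty is wanted, `step` accepts a `k` argument; for instance `k = log(nrow(mtcars))` gives a BIC-type criterion.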

If you want to learn about AIC, there are a few ways to approach it. One is to see it as a penalized likelihood. Another is to view it through the lens of information theory.
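The penalized-likelihood view also shows where the $0.1573$ mentioned in the comments comes from. The following is a sketch; the symbols $\ell$ (maximized log-likelihood) and $k$ (number of estimated parameters) are notation introduced here, not taken from the thread.

$$\mathrm{AIC} = -2\,\ell + 2k$$

Dropping a single parameter from a model changes the AIC by

$$\mathrm{AIC}_{\text{reduced}} - \mathrm{AIC}_{\text{full}} = 2\,(\ell_{\text{full}} - \ell_{\text{reduced}}) - 2 ,$$

so the term is kept exactly when the likelihood-ratio statistic $2\,(\ell_{\text{full}} - \ell_{\text{reduced}})$ exceeds $2$. Under the usual approximation that this statistic follows a $\chi^2_1$ distribution when the term has no effect, that is the same as keeping the term when its p-value is below

$$P(\chi^2_1 > 2) \approx 0.1573 .$$

For a term with more than one parameter, such as a factor with several levels, the penalty and the degrees of freedom both change, so the implied threshold is different.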

The AIC-based sequential method is a competing (or perhaps complementary) algorithm to the p-value-based one you were probably thinking of. Each has merits depending on the context. By the way, sequential selection is still an area of active research.
