I have run a stepwise regression in R. However, the summary of the final model includes some factors that are not significant. Why have these factors not been removed? Should I remove them from my model? The VIFs of these factors are all under 5.
-
Mind sharing the method that you used? Was it stepAIC? – Ben Ogorek Jul 29 '14 at 03:50
-
I just used the code "summary(step(model))". "Model" being the name of the model used. – Blair Outhwaite Jul 29 '14 at 03:59
-
What made you use stepwise regression? Do you know how to run simulation studies that demonstrate how poorly these methods perform? – Frank Harrell Jul 29 '14 at 04:21
-
Do you mean some *levels* of one or more factors are not significant in the output from `summary(model)`? Stepwise methods should rightly work on the amount of variance (expressed in one of a number of ways) explained by an entire term, i.e. over all levels of a factor. Some levels may not be significant but one or more levels will be. However, what you can infer from the $t$ stats and their p-values in that summary output is limited owing to multiple testing (one per $t$) and, *more importantly*, the inherent problems of stepwise procedures, which render the $p$ values largely uninformative. – Gavin Simpson Jul 29 '14 at 04:26
-
I bet if you used $\alpha = 0.1573$ they're all significant, though. How'd I do? [Save your applause, though](http://stats.stackexchange.com/questions/97257/stepwise-regression-in-r-critical-p-value/97309#97309), it's just a little algebra. – Glen_b Jul 29 '14 at 06:18
-
Does [this](http://stats.stackexchange.com/questions/97257/stepwise-regression-in-r-critical-p-value/97309#97309) count as effectively a duplicate? – Glen_b Jul 29 '14 at 06:23
-
From the output, if I am reading it correctly then it's all levels of the factor that are insignificant. I am using stepwise regression as this is what our lecturer wants us to use. – Blair Outhwaite Jul 29 '14 at 22:09
1 Answer
Blair,
The final model includes terms with p-values above the customary threshold because the function you used, `step`, does not select on p-values. It uses a different criterion, the Akaike information criterion (AIC). AIC is a summary evaluation of the entire model at each stage, balancing goodness of fit against the number of parameters, and one model may have a smaller (i.e., better) AIC even though it contains terms with higher p-values.
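To see this behaviour for yourself, here is a minimal sketch using the built-in `mtcars` data. The data set and the choice of predictors are my own assumptions for illustration, not your model. It runs `step` the same way you did and then uses `drop1` to show how AIC would change if each remaining term were removed.

```r
# Illustrative sketch only: mtcars and these predictors are assumptions,
# standing in for the asker's own data and model.
full <- lm(mpg ~ wt + hp + qsec + drat + disp, data = mtcars)

# With no scope given, step() works backward: at each stage it drops the
# single term whose removal lowers AIC the most, and stops when no single
# removal lowers AIC any further.
final <- step(full, trace = 0)

# Coefficient table of the selected model. step() never looks at these
# p-values, so some of them may sit above 0.05.
print(summary(final)$coefficients)

# For each remaining term, the AIC of the model without it. The "<none>"
# row is the selected model itself; every other row has an AIC at least
# as large, which is why step() kept those terms.
print(drop1(final))

# The selected model has an AIC no larger than the starting model's.
print(c(full = AIC(full), final = AIC(final)))
```

Whether any surviving p-value actually exceeds 0.05 depends on the data; the point is only that `step` compares AIC values, not p-values.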
If you want to learn about AIC, there are a few ways to approach it. One is to see it as a penalized likelihood. Another is to view it through the lens of information theory.
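The penalized-likelihood view also explains the $\alpha = 0.1573$ mentioned in the comments. As a sketch, assuming a single-parameter term and the usual large-sample chi-squared approximation for the likelihood ratio statistic, write $\hat{L}$ for the maximized likelihood and $k$ for the number of estimated parameters:

$$
\mathrm{AIC} = -2\log\hat{L} + 2k
$$

Dropping a one-parameter term changes $k$ to $k-1$, so the term is kept whenever removing it would raise AIC:

$$
-2\log\hat{L}_{\text{reduced}} + 2(k-1) > -2\log\hat{L}_{\text{full}} + 2k
\iff
2\left(\log\hat{L}_{\text{full}} - \log\hat{L}_{\text{reduced}}\right) > 2
$$

The left-hand side is the likelihood ratio statistic, which is approximately $\chi^2_1$ under the null, and

$$
P\left(\chi^2_1 > 2\right) \approx 0.1573 .
$$

So for single-parameter terms, selecting on AIC behaves roughly like a p-value rule with a threshold near 0.157 rather than 0.05. For a term with $d$ parameters (such as a factor with $d+1$ levels), the same argument gives the cutoff $P(\chi^2_d > 2d)$.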
The AIC-based sequential method is a competing (or perhaps complementary) algorithm to the one you were probably thinking of based on p-values. Either one has merits depending on the context. By the way, sequential selection is still an area of active research.
