Stepwise Regression

Question

I have some data that I want to fit a log-linear model to (using R). The data in this dataset is categorical and y is the frequency. I use the glm function with family=poisson. I firstly fit the saturated model and check the residuals. I obtain

Clearly, cases 42 and 61 are outliers so should I remove them to continue with the analysis.

Next, I apply backwards elimination and obtain model 1. It has g^2=15.136 on 16 d.f. and p-value 0.514682 and Pearson's test statistic = 14.27873 on 16 d.f. with p-value 0.577. This model has an AIC of 386.3297 and BIC of 489.9561. RSS=2.491516

I then start with the empty model and apply forward selection and obtain model 2. It has g^2= 126.38 on 53 d.f. p-value 6.2327e-0.8 and Pearson's test statistic. =123.9027 on 53 d.f. p-value 1.31x10-07. It has an AIC of 423.5739 and BIC of 447.3217. RSS= 9.967527

Model 1 has higher order terms than model 2 (model 1 has a four way interaction term). Model two has two way interaction terms. Both models are nested.

I know that the different directions of stepwise regression can produce different results. I don't know which model to proceed with because the AIC of model 2 is higher than model 1, yet it has lower order interaction terms and very low p-value.

Any suggestions?

[Do not use stepwise regression](https://stats.stackexchange.com/questions/363821/how-to-do-stepwise-regression-with-a-binary-dependent-variable/363826#363826). — Alexis, Mar 02 '20 at 19:01
Did you click the link in my comment and read it? :) (Also: welcome to CV!) — Alexis, Mar 03 '20 at 03:30
ah okay thank @Alexis however my supervisor has advised me to stick with stepwise regression — Ng123, Mar 03 '20 at 13:40
I recommend your supervisor read the series of cited articles as well. :) — Alexis, Mar 03 '20 at 15:22

Stepwise Regression

0 Answers0