I'm having trouble to understand Backward elimination in Logistic Regression model. I was looking at this example of Agresti, Categorical Data Analysis, to see how Backward elimination works.
What I understood is:
In the Backward elimination, we first built the full model with all independent variables and interactions. Then we start to look at the interactions of high order,checking whether they are significant or not, and eliminating them.
Let $M_1$ the model $1$ and $M_2$ the model $2$, so to test if the interactions of $order=3$ are significant we test the hypothesis $$H_0:M_2\ is\ the\ true\ model$$ where $$Deviance(M_2)-Deviance(M_1)\sim \chi_3^2$$
Since $Deviance(M_2)-Deviance(M_1)=3.2$ and $p.value=0.36$ we don't reject $H_0$. Then $M_2$ is true and the interactions of $order=3$ are insignificant.
That is the idea? If I already tested all interactions, I can apply the same criterion for the main effects to test if they are significant?
How the AIC criterion works? In this case is better choose the model 8 because that have the smallest AIC?
I read in different places, but I still don't understand well. How can I choose the better method of variable selection?