I am running a generalized linear mixed model (GLMM) analysis in SPSS 25 and have reached the point where I would like to justify the selection of my final model using information criteria.
However, there is one thing I do not understand. SPSS reports p-values for individual parameters based on the Wald statistic, which tells me whether the effect of a given IV is significant. I assume the general procedure is to estimate AIC/BIC for the largest candidate model first, then remove variables one by one, recalculating AIC/BIC after each step and comparing the values.

I started by removing my non-significant variables from the model, one by one, and as expected, AIC/BIC both favored the new, simpler models. So far, so good. I noticed, however, that even when I remove my significant IVs, AIC/BIC still become smaller as the model gets simpler, regardless of whether the removed variable had a significant effect. This is not entirely unexpected, since (if I understand correctly) AIC/BIC are essentially adjusted versions of -2LL, penalized for model complexity. Nonetheless, I still don't understand how I can use them for model selection if they always favor the simplest possible model, that is, the one with zero parameters. Is there something I am missing?
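For reference, here is my understanding of the definitions (I am assuming SPSS reports the standard forms):

$$\mathrm{AIC} = -2\,\mathrm{LL} + 2k, \qquad \mathrm{BIC} = -2\,\mathrm{LL} + k\ln(n),$$

where $k$ is the number of estimated parameters and $n$ is the sample size. If that is right, then dropping one parameter should lower AIC only when the resulting increase in $-2\,\mathrm{LL}$ is less than 2 (less than $\ln(n)$ for BIC), so I would not expect removing a clearly significant IV to keep lowering both criteria.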
EDIT: After some further testing, it appears that AIC favors the model in which only the fixed effects and their interactions are retained, without random slopes, even though my two random slopes are significant. Is it okay to remove such random parameters based on AIC?
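To make the comparison I have in mind concrete outside of SPSS, here is a minimal sketch in Python/statsmodels (simulated data and hypothetical variable names, not my actual model). I am assuming the models should be fit by maximum likelihood (reml=False) rather than REML, so that the log-likelihoods are comparable across different random-effects structures:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated toy data (hypothetical, just to illustrate the comparison):
# 30 groups, 20 observations each, with a true random intercept and slope.
rng = np.random.default_rng(1)
n_groups, n_per = 30, 20
g = np.repeat(np.arange(n_groups), n_per)
x = rng.normal(size=n_groups * n_per)
u0 = rng.normal(scale=0.5, size=n_groups)   # random intercepts
u1 = rng.normal(scale=0.3, size=n_groups)   # random slopes
y = 1.0 + 0.4 * x + u0[g] + u1[g] * x + rng.normal(size=n_groups * n_per)
df = pd.DataFrame({"y": y, "x": x, "g": g})

# Fit by maximum likelihood: REML log-likelihoods are not comparable
# across models with different structures.
m_slope = smf.mixedlm("y ~ x", df, groups="g", re_formula="~x").fit(reml=False)
m_intercept = smf.mixedlm("y ~ x", df, groups="g", re_formula="~1").fit(reml=False)

def aic(llf, k):
    """AIC = -2*logLik + 2*k, with k the number of estimated parameters."""
    return -2 * llf + 2 * k

# Parameter counts (fixed effects + variance components + residual variance):
# random-slope model:     2 fixed + 3 covariance params + 1 residual = 6
# random-intercept model: 2 fixed + 1 variance          + 1 residual = 4
print("AIC, random slope:   ", aic(m_slope.llf, 6))
print("AIC, intercept only: ", aic(m_intercept.llf, 4))
```

This is the kind of comparison I am trying to do: if the random-slope model has the lower AIC, keep the slope; otherwise drop it. My question is whether that logic is legitimate when the Wald tests say the random slopes are significant.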