I have several candidate spline models and I'm trying to use AIC or BIC to choose variables. AIC is lower when I include all variables than when I exclude any one or any two of them. However, if I exclude three particular variables, AIC is lower than for the full model. Is this theoretically possible, or am I doing something wrong? My concern is that backward selection with AIC would stop at the full model, since dropping any single variable does not reduce AIC, and so it would never reach the better model with three variables removed.
Sorry, I can't post a reproducible example: my spline model is quite complex and the data is proprietary to my company.
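To make the concern concrete without the real data, here is a minimal toy sketch in Python. The AIC values are invented purely to mimic the pattern I described and do not come from my model; the variable names `x1` to `x4` and the functions `toy_aic`, `backward_select`, and `exhaustive_select` are hypothetical. It compares greedy backward elimination (drop one variable at a time, stop when no single drop lowers AIC) with a search over all subsets.

```python
from itertools import combinations

VARIABLES = ("x1", "x2", "x3", "x4")  # hypothetical variable names


def toy_aic(kept):
    """Made-up AIC values that reproduce the pattern in the question.

    These numbers are NOT from a fitted model; they are assumptions
    chosen only to illustrate the selection behaviour.
    """
    dropped = frozenset(VARIABLES) - frozenset(kept)
    if not dropped:
        return 100.0  # full model
    if dropped == frozenset({"x1", "x2", "x3"}):
        return 97.0   # dropping these three together lowers AIC
    return 100.0 + 1.5 * len(dropped)  # every other drop raises AIC


def backward_select(variables, aic):
    """Greedy backward elimination: drop one variable per step,
    and stop as soon as no single drop lowers AIC."""
    kept = set(variables)
    best = aic(kept)
    while kept:
        cand_aic, cand_var = min((aic(kept - {v}), v) for v in sorted(kept))
        if cand_aic >= best:
            break
        kept.remove(cand_var)
        best = cand_aic
    return kept, best


def exhaustive_select(variables, aic):
    """Evaluate every subset of variables and return the lowest-AIC one."""
    subsets = (
        set(combo)
        for size in range(len(variables) + 1)
        for combo in combinations(variables, size)
    )
    best = min(subsets, key=aic)
    return best, aic(best)


greedy_kept, greedy_aic = backward_select(VARIABLES, toy_aic)
best_kept, best_aic = exhaustive_select(VARIABLES, toy_aic)
print("backward:  ", sorted(greedy_kept), greedy_aic)
print("exhaustive:", sorted(best_kept), best_aic)
```

With these made-up values, backward elimination stops at the full model, while the all-subsets search finds the model with `x1`, `x2`, and `x3` removed. That is exactly the behaviour I'm worried about; what I don't know is whether real AIC values can follow this pattern.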