I'm not sure if this counts as "step-wise" model selection, but here is what I'm doing (sketched in code below the list):

  1. Decide on a handful of candidate models through exploratory data analysis.
  2. Fit the models to the data, and calculate their AIC.
  3. Pick the model with the lowest AIC score.
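
A minimal sketch of these three steps in Python, assuming statsmodels and ARMA-type candidates (the series and the candidate orders below are placeholders, not my actual data):

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(0)
y = rng.standard_normal(200)  # placeholder series; substitute the real data

# Step 1: a handful of candidate (p, q) orders from EDA (illustrative values)
candidates = [(1, 0), (0, 1), (1, 1)]

# Steps 2 and 3: fit each candidate, compute its AIC, keep the lowest
fits = {(p, q): ARIMA(y, order=(p, 0, q)).fit() for p, q in candidates}
best = min(fits, key=lambda order: fits[order].aic)
print(best, fits[best].aic)
```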

I've read that step-wise AIC is "primarily problematic for inference", but I don't understand the rationale behind this claim. I know that inference and prediction are different goals, but I don't know what restrictions each goal places on the analytical procedure.

Is my model selection procedure problematic for inference? For prediction? Why?

  • If the "exploratory data analysis" involves the test data *in any way* then there is potential for optimistic bias in the performance estimate for the final model. This potential is not necessarily negligible. It also means your work is not easily reproducible. – Dikran Marsupial May 11 '19 at 16:08
  • @DikranMarsupial To be more specific, I'm doing time series analysis, and "exploratory data analysis" means deciding which of the AR/MA/ARMA models are applicable and their approximate orders by looking at the ACF/PACF plot. I suppose that's OK? – nalzok May 11 '19 at 16:10
  • ANY choice you make about your model that depends on the data used to evaluate it may introduce a bias. That bias may be negligible, or it may not; my work has shown that some of these biases thought to be negligible (in classification) turned out not to be. The more choices you make, the larger the bias; likewise, the higher the variance of the performance estimate, the more problematic it is likely to be. – Dikran Marsupial May 11 '19 at 16:28
  • Likewise, if you use a penalisation method, it may not account for the "degrees of researcher freedom" introduced by the "exploratory data analysis" (essentially because the choice may be influenced by factors that exploit the noise in the data, rather than the underlying structure). Again, often this is entirely negligible, but sometimes it isn't. – Dikran Marsupial May 11 '19 at 16:32
  • @DikranMarsupial I see, would you suggest skipping step 1 and using ARMA($p$, $q$) for all $p, q < 5$ as the candidates in step 2 then (see the grid-search sketch after this thread)? This is contrary to [these answers](https://stats.stackexchange.com/questions/134487/analyse-acf-and-pacf-plots), which appear to advocate looking at the (P)ACF plots before fitting the models, so I'm a little confused... – nalzok May 11 '19 at 16:40
  • No, the point is that you may need to account for these choices in evaluating performance (e.g. nested cross-validation, sketched after this thread) and be aware that there may be some over-fitting introduced by the choices that were not subjected to a penalty. – Dikran Marsupial May 11 '19 at 18:56
  • @DikranMarsupial What exactly do you mean by "penalty"? AIC "penalizes" complex models by assigning them higher scores, so I suppose models will be penalized equally regardless of how they're chosen? – nalzok May 11 '19 at 23:57
  • Performing an exploratory data analysis to make decisions about your model is effectively adding hidden complexity to your model that is not penalised by AIC. You are selecting a model from a class of models using the data. The class of models you start with is the class you look at in the EDA, but that is a bigger class of models than the one to which you apply AIC. – Dikran Marsupial May 13 '19 at 07:01
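
For concreteness, here is a sketch of the exhaustive alternative raised above: fit every ARMA($p$, $q$) with $p, q < 5$ and let AIC choose, instead of pre-screening orders with ACF/PACF plots. It reuses the placeholder series `y` from the first sketch; the suppressed warnings are just convergence noise from awkward orders:

```python
import itertools
import warnings

from statsmodels.tsa.arima.model import ARIMA

aics = {}
for p, q in itertools.product(range(5), range(5)):
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")  # some orders may converge poorly
        aics[(p, q)] = ARIMA(y, order=(p, 0, q)).fit().aic

best_p, best_q = min(aics, key=aics.get)
print(best_p, best_q, aics[(best_p, best_q)])
```

As the last comment notes, AIC's complexity penalty only operates within an explicit candidate set such as this one; any screening done beforehand adds unpenalised researcher degrees of freedom.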
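And a minimal sketch of the nested evaluation suggested in the comments, under the assumption that sklearn's TimeSeriesSplit is acceptable for generating ordered train/test indices: the entire AIC selection is repeated inside each training fold, and the selected model is scored on the held-out fold, so the selection step itself is included in the performance estimate. The candidate list is again illustrative:

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit
from statsmodels.tsa.arima.model import ARIMA

errors = []
for train_idx, test_idx in TimeSeriesSplit(n_splits=5).split(y):
    train = y[train_idx]
    # Inner selection: AIC over the candidates, using training data only
    fits = {(p, q): ARIMA(train, order=(p, 0, q)).fit()
            for p, q in [(1, 0), (0, 1), (1, 1)]}
    best = min(fits, key=lambda order: fits[order].aic)
    # Outer evaluation: forecast the held-out block with the selected model
    forecast = fits[best].forecast(steps=len(test_idx))
    errors.append(np.mean((y[test_idx] - forecast) ** 2))

print(np.mean(errors))  # MSE estimate that accounts for the selection step
```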

0 Answers