It's pretty well acknowledged that error control via p-values fails when models are selected based on the data rather than specified a priori. I've always viewed this as an issue of marginal vs. conditional distributions: $$P(\text{error}) = P(\text{error} \mid M = M_1)P(M = M_1) + P(\text{error} \mid M = M_2)P(M = M_2) \neq P(\text{error} \mid M = m),$$ where $M$ denotes the (data-dependent) model selection, $M_1$ and $M_2$ are the candidate models, and $m$ is a realized value of $M$.
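To make this concrete, here is a minimal Monte Carlo sketch (a toy setup of my own, not from any particular reference): under a global null with two candidate regressors, selecting the one with the larger $|t|$ and testing it at $\alpha = 0.05$ inflates the marginal type I error to roughly $1 - (1 - \alpha)^2 \approx 0.0975$.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, alpha, n_sims = 100, 0.05, 20_000
errors = 0
for _ in range(n_sims):
    # y is independent of x1 and x2, so any rejection is a type I error
    x1, x2, y = rng.standard_normal((3, n))
    t_stats = []
    for x in (x1, x2):
        r = np.corrcoef(x, y)[0, 1]
        t_stats.append(r * np.sqrt((n - 2) / (1 - r ** 2)))
    # data-driven selection: keep the regressor with the larger |t|
    t_sel = max(t_stats, key=abs)
    p_val = 2 * stats.t.sf(abs(t_sel), df=n - 2)
    errors += p_val < alpha

# each per-model test is level alpha when the model is fixed a priori,
# but conditioning on the selection event changes the error rate
print(f"marginal type I error ~ {errors / n_sims:.3f} (nominal {alpha})")
```

Each test would be exactly level $\alpha$ had its model been fixed in advance; it is the conditioning on a data-dependent selection event that breaks the guarantee.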
However, 'robust' p-values still maintain error control under incorrect models, provided that the target of inference remains the same under each model¹. Hence, since our p-values control error under each candidate model², the marginal error rate is still at most whatever $\alpha$ we choose (spelled out below). Am I missing anything here?
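Spelling out the step I'm relying on: if the p-value is valid conditionally on each selection, i.e. $P(\text{error} \mid M = M_i) \le \alpha$ for every candidate $M_i$, then by the law of total probability $$P(\text{error}) = \sum_i P(\text{error} \mid M = M_i)\,P(M = M_i) \le \alpha \sum_i P(M = M_i) = \alpha.$$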
¹ This does not hold in general; the relevant estimator must have the same expected value (i.e., target the same estimand) under each model.
² It would seemingly still affect power calculations, though.