It is well known that choosing a statistical test based on the outcome of another statistical test is problematic, because the resulting p-values are difficult or even impossible to interpret (see e.g. "Choosing a statistical test based on the outcome of another (e.g. normality)"). However, this is still standard practice in many applications, and applied papers rarely seem to notice or discuss it. Looking through the literature, I failed to find a paper that actually discusses this phenomenon.
I would appreciate links to any publications relating to choosing a statistical test based on the outcome of another statistical test, especially any that are accessible to applied scientists.