I think these related questions are worth a look:
Can I perform an exhaustive search with cross-validation for feature selection?
Feature selection and cross-validation
They talk about CV, but the idea is mostly the same: when you reuse the training data to fit a model after running a feature selection algorithm on it, it is basically as if you first trained a simple algorithm on that data and then used its results (by zeroing out a non-random set of variables) to fit another model on the same data.
In real life this is often not so criminal, because it is not that easy to overfit through feature selection alone unless you have a model with millions of noisy features.
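
To get a feel for when it does matter, here is a minimal sketch using scikit-learn (my choice of library; the data, `SelectKBest`, `k=20` and the logistic regression are illustrative assumptions, not anything from the linked questions). The features are pure noise, so any accuracy well above 50% comes from reusing the data for selection:

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
n_samples, n_features = 50, 5000

# Pure noise: the balanced labels are independent of every feature.
X = rng.standard_normal((n_samples, n_features))
y = rng.permutation(np.repeat([0, 1], n_samples // 2))

# Leaky: features are selected using ALL the data, then the model is
# cross-validated on the already-selected columns.
X_selected = SelectKBest(f_classif, k=20).fit_transform(X, y)
leaky_score = cross_val_score(
    LogisticRegression(max_iter=1000), X_selected, y, cv=5
).mean()

# Honest: selection sits inside the pipeline, so it is refit on the
# training part of each fold and never sees the held-out part.
pipeline = make_pipeline(
    SelectKBest(f_classif, k=20), LogisticRegression(max_iter=1000)
)
honest_score = cross_val_score(pipeline, X, y, cv=5).mean()

print(f"selection outside CV: {leaky_score:.2f}")
print(f"selection inside CV:  {honest_score:.2f}")
```

With few samples and thousands of noisy features, the first score should come out far too optimistic while the second stays near chance. If you shrink `n_features` to a few dozen, the gap mostly disappears, which is the "not so criminal" case.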