I know that you should split your data into training and validation sets before doing feature selection, so that the selection does not leak information from the validation data and produce optimistically biased cross-validation scores.
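To make sure I understand the leakage part, here is a toy sketch I put together (my own example, not from anywhere in particular): on pure-noise data, selecting features with scikit-learn's SelectKBest on the full dataset before cross-validating looks great, while putting the selection inside the CV pipeline gives the honest (chance-level) score. The choice of SelectKBest, f_classif, and the synthetic data are all just assumptions for illustration.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# Pure-noise data: labels are independent of the features, so any
# "good" CV score here can only come from leakage.
rng = np.random.RandomState(0)
X = rng.normal(size=(100, 5000))
y = rng.randint(0, 2, size=100)

# Wrong: features are chosen while looking at ALL labels, including the
# samples that later end up in the test folds.
X_leaky = SelectKBest(f_classif, k=20).fit_transform(X, y)
print("leaky CV:", cross_val_score(LogisticRegression(max_iter=1000), X_leaky, y).mean())

# Right: selection is refit inside each training fold only.
pipe = make_pipeline(SelectKBest(f_classif, k=20), LogisticRegression(max_iter=1000))
print("honest CV:", cross_val_score(pipe, X, y).mean())
```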
But I have also seen people say that you should avoid doing feature selection on the same data that you train your model on, so as not to overfit to that data.
What some suggest is splitting the data into three sets: training, validation, and test. You train your model on the training set, do feature selection based on the validation set, and evaluate the final model on the test set (a rough sketch of what I mean is below).
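Here is roughly how I imagine that procedure, as a minimal sketch: the three-way split, with a greedy forward selection scored on the validation set (the greedy method, the 60/20/20 split, and the synthetic dataset are just assumptions I made to have something concrete; they are not part of the advice I read).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=30, n_informative=5, random_state=0)

# Split: 60% train, 20% validation, 20% test.
X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=0)

selected, remaining = [], list(range(X.shape[1]))
best_score = -np.inf
while remaining:
    # Try adding each remaining feature; train on the training set,
    # score on the validation set.
    scores = []
    for f in remaining:
        cols = selected + [f]
        model = LogisticRegression(max_iter=1000).fit(X_train[:, cols], y_train)
        scores.append(model.score(X_val[:, cols], y_val))
    best_idx = int(np.argmax(scores))
    if scores[best_idx] <= best_score:
        break  # no remaining feature improves validation accuracy
    best_score = scores[best_idx]
    selected.append(remaining.pop(best_idx))

# Final, untouched estimate on the test set.
final = LogisticRegression(max_iter=1000).fit(X_train[:, selected], y_train)
print("selected features:", selected)
print("test accuracy:", final.score(X_test[:, selected], y_test))
```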
Is this kind of overfitting really something to worry about, and if so, is the three-way split a good way to deal with it?