
I have a small number of samples and a large number of features. For feature selection, I plan to divide my full data set into a feature selection set and a test set. I run the t-test on the former and then test on the latter. My question: assuming some features appear significant in both sets, so that I have confirmed their significance, can I re-merge the two sets and use the larger set for model training with the selected features? Will the test still be blind? Thanks
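To make the procedure concrete, here is a minimal sketch of what I have in mind, not a finished implementation. Everything in it is an assumption for illustration: the data are synthetic, the helper names `welch_t` and `significant_features` are hypothetical, and a fixed cutoff on |t| stands in for a proper p-value threshold.

```python
import math
import random
import statistics


def welch_t(a, b):
    """Welch's t-statistic for two independent samples."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va / len(a) + vb / len(b))


def significant_features(X, y, t_cutoff):
    """Indices of features whose |t| between the two classes exceeds t_cutoff."""
    selected = set()
    for j in range(len(X[0])):
        g0 = [row[j] for row, label in zip(X, y) if label == 0]
        g1 = [row[j] for row, label in zip(X, y) if label == 1]
        if abs(welch_t(g0, g1)) > t_cutoff:
            selected.add(j)
    return selected


# Synthetic data (assumption): 40 samples, 200 features,
# and only feature 0 truly differs between the two classes.
rng = random.Random(0)
n_samples, n_features = 40, 200
y = [i % 2 for i in range(n_samples)]
X = [[rng.gauss(5.0 * label if j == 0 else 0.0, 1.0) for j in range(n_features)]
     for label in y]

# Stratified split: half of each class goes to the selection set, the rest to the test set.
sel_idx, test_idx = [], []
for cls in (0, 1):
    members = [i for i in range(n_samples) if y[i] == cls]
    rng.shuffle(members)
    half = len(members) // 2
    sel_idx += members[:half]
    test_idx += members[half:]

X_sel, y_sel = [X[i] for i in sel_idx], [y[i] for i in sel_idx]
X_test, y_test = [X[i] for i in test_idx], [y[i] for i in test_idx]

# Step 1: t-test on the feature selection set only.
selected = significant_features(X_sel, y_sel, t_cutoff=2.5)

# Step 2: keep the features that are also significant on the test set.
confirmed = selected & significant_features(X_test, y_test, t_cutoff=2.5)

print("selected on selection set:", sorted(selected))
print("confirmed on test set:", sorted(confirmed))
```

The step I am asking about would come after this: training a model on `X_sel` and `X_test` merged, restricted to the `confirmed` features.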

Glen_b
Theoden
