I am doing a multiple regression analysis in which I want to eliminate some of the insignificant features. Most machine learning books use subset selection, shrinkage methods, or PCA to reduce the number of features. Why are p-values not commonly used for feature selection?

kjetil b halvorsen
Siddhesh
  • [This is why.](http://stats.stackexchange.com/a/20856/1352) (Whether you do it in a stepwise manner or all at once doesn't change the fundamental problem.) – Stephan Kolassa Nov 19 '15 at 08:20
  • @Stephan: I read the answer. Does it imply that p-values should never be used? – Siddhesh Nov 19 '15 at 12:40
  • No. You can use and interpret p-values if you use them correctly. [This is a good place to start understanding them](http://stats.stackexchange.com/questions/tagged/p-value?sort=votes&pageSize=50). In your specific case, if you look at multiple models (by selecting features), the p-values will no longer be uniformly distributed under the null hypothesis, so you either need to find their new distribution (e.g., through simulation) or interpret them differently. – Stephan Kolassa Nov 19 '15 at 12:45
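
To make the simulation remark above concrete, here is a minimal sketch, not anything posted in the thread. It assumes a deliberately simplified setup: every candidate feature is pure noise, each is tested on its own with a simple (one-feature) regression rather than a full multiple regression, and the t statistic is turned into a p-value with a normal approximation. The function names (`slope_p_value`, `simulate`) and all parameter values are made up for illustration.

```python
import math
import random


def slope_p_value(x, y):
    """Approximate two-sided p-value for the slope of y regressed on x.

    Assumption: the t statistic is treated as standard normal, which is
    only a rough approximation for moderate sample sizes.
    """
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)             # sample correlation
    t = r * math.sqrt((n - 2) / (1 - r * r))   # t statistic for the slope
    return math.erfc(abs(t) / math.sqrt(2))    # normal approximation


def simulate(n_sims=500, n_obs=100, n_features=10, alpha=0.05, seed=0):
    """Compare rejection rates with and without selecting the best feature."""
    rng = random.Random(seed)
    fixed_hits = 0     # feature chosen before looking at the data
    selected_hits = 0  # feature chosen because its p-value was smallest
    for _ in range(n_sims):
        y = [rng.gauss(0, 1) for _ in range(n_obs)]
        p_values = []
        for _ in range(n_features):
            # Noise feature: unrelated to y, so the null hypothesis is true.
            x = [rng.gauss(0, 1) for _ in range(n_obs)]
            p_values.append(slope_p_value(x, y))
        fixed_hits += p_values[0] < alpha
        selected_hits += min(p_values) < alpha
    return fixed_hits / n_sims, selected_hits / n_sims


fixed_rate, selected_rate = simulate()
print(f"pre-specified feature rejected: {fixed_rate:.3f}")
print(f"best-of-10 feature rejected:    {selected_rate:.3f}")
```

For the pre-specified feature the rejection rate should land near the nominal 0.05. For the selected feature it should be several times larger, since for ten independent tests the chance that at least one falls below 0.05 is 1 - 0.95^10, about 0.40. That gap is what "not uniformly distributed under the null" means in practice: the p-value of a feature that survived selection cannot be read against the usual threshold.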

0 Answers