I am doing multiple regression analysis, in which I want to eliminate some of the insignificant features. Most machine learning books use subset selection, shrinkage methods, or PCA to reduce the number of features. Why are p-values not commonly used for feature selection?
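For concreteness, here is a minimal sketch (my own illustration on simulated data, using scikit-learn's `LassoCV`) of one of the shrinkage methods mentioned above: the Lasso drives the coefficients of weak features exactly to zero, so selection comes out of the fit itself rather than from per-coefficient p-values.

```python
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(1)
n, k = 200, 10
X = rng.normal(size=(n, k))
# Only the first three features actually matter in this simulated data.
y = 3.0 * X[:, 0] + 2.0 * X[:, 1] + 1.5 * X[:, 2] + rng.normal(size=n)

lasso = LassoCV(cv=5).fit(X, y)   # penalty strength chosen by cross-validation
kept = np.flatnonzero(lasso.coef_ != 0)
print("features kept by the Lasso:", kept)
```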
- [This is why.](http://stats.stackexchange.com/a/20856/1352) (Whether you do it in a stepwise manner or all at once doesn't change the fundamental problem.) – Stephan Kolassa Nov 19 '15 at 08:20
- @Stephan: I read the answer. Does it imply p-values should never be used? – Siddhesh Nov 19 '15 at 12:40
- No. You can use and interpret p-values if you use them correctly. [This is a good place to start understanding them.](http://stats.stackexchange.com/questions/tagged/p-value?sort=votes&pageSize=50) In your specific case, if you look at multiple models (by selecting features), the p-values will no longer be uniformly distributed under the null hypothesis, so you either need to find their new distribution (e.g., through simulation) or interpret them differently. – Stephan Kolassa Nov 19 '15 at 12:45
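To make that last point concrete, here is a minimal simulation sketch (my own, using statsmodels OLS on a pure-noise response): every null hypothesis is true, yet the smallest of ten candidate p-values lands below 0.05 roughly 40% of the time instead of 5%, so the usual interpretation of the selected feature's p-value no longer holds.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n, k, n_sim = 100, 10, 1000        # observations, candidate features, simulations
selected_pvals = []

for _ in range(n_sim):
    X = rng.normal(size=(n, k))
    y = rng.normal(size=n)         # pure noise: no feature is truly related to y
    pvals = [sm.OLS(y, sm.add_constant(X[:, j])).fit().pvalues[1] for j in range(k)]
    selected_pvals.append(min(pvals))   # keep only the most "significant" feature

# An honest p-value falls below 0.05 about 5% of the time under the null;
# the p-value of the selected feature falls below 0.05 far more often (~40% here).
print("fraction of selected p-values < 0.05:",
      np.mean(np.array(selected_pvals) < 0.05))
```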