
I am running several different algorithms (SVM, Random Forest, and other decision-tree-based algorithms).

I noticed that my performance went down (error went up) after adding new features.

performance(Feature_family_1 + Feature_family_2) < performance(Feature_family_1)

Is this possible, and how can I make sure I don't lose information? Should I use PCA?

Timothée HENRY

1 Answer


Duplicate of:

Why does increasing the number of features reduce performance?

where I found this explanation particularly good:

Adding irrelevant predictors can worsen performance on new data by increasing the variance of the prediction (overfitting). This is because you end up fitting to noise and diluting the "true signal".
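
The effect described in that quote can be reproduced on synthetic data. The following is only a minimal sketch under stated assumptions: the data is made up (one informative feature plus 30 pure-noise features), and a plain 1-nearest-neighbour classifier stands in for the models from the question because it needs nothing beyond the standard library. The helper names `make_data` and `one_nn_accuracy` are hypothetical. Tree-based models are usually less sensitive to noise features than a distance-based method, so the size of the drop will differ for them.

```python
import random

def make_data(n, n_noise, rng):
    """Return (X, y): one informative feature followed by n_noise noise features."""
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        informative = rng.gauss(3.0 * label, 1.0)  # class means 3 standard deviations apart
        noise = [rng.gauss(0.0, 3.0) for _ in range(n_noise)]  # unrelated to the label
        X.append([informative] + noise)
        y.append(label)
    return X, y

def one_nn_accuracy(X_train, y_train, X_test, y_test, n_features):
    """Accuracy of a 1-nearest-neighbour classifier using the first n_features columns."""
    correct = 0
    for x, label in zip(X_test, y_test):
        # Index of the training point with the smallest squared Euclidean distance.
        nearest = min(
            range(len(X_train)),
            key=lambda i: sum((x[j] - X_train[i][j]) ** 2 for j in range(n_features)),
        )
        correct += y_train[nearest] == label
    return correct / len(X_test)

rng = random.Random(0)  # fixed seed so the run is repeatable
X_train, y_train = make_data(200, 30, rng)
X_test, y_test = make_data(200, 30, rng)

acc_clean = one_nn_accuracy(X_train, y_train, X_test, y_test, 1)   # informative feature only
acc_noisy = one_nn_accuracy(X_train, y_train, X_test, y_test, 31)  # plus the 30 noise features
print(f"informative feature only: {acc_clean:.2f}")
print(f"plus 30 noise features:   {acc_noisy:.2f}")
```

On this toy data the second accuracy comes out lower than the first: the noise columns dominate the distance computation and drown out the one feature that carries the signal.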

Also:

In addition, as the number of dimensions grows, it becomes more likely that some features are correlated with each other, which is a problem for many learning algorithms.
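
One way to check for correlated features after adding a new feature family is to compute pairwise Pearson correlations and list the pairs above a threshold. This is a rough sketch, not a recommendation from the linked answer: the column names, the synthetic data, the 0.9 threshold, and the helpers `pearson` and `correlated_pairs` are all assumptions chosen for illustration.

```python
import math
import random

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)

def correlated_pairs(columns, threshold=0.9):
    """Return the pairs of column names whose absolute correlation is >= threshold."""
    names = list(columns)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if abs(pearson(columns[first], columns[second])) >= threshold:
                pairs.append((first, second))
    return pairs

rng = random.Random(0)  # fixed seed so the run is repeatable
f1 = [rng.gauss(0.0, 1.0) for _ in range(500)]
columns = {
    "f1": f1,
    "f2": [2.0 * v + rng.gauss(0.0, 0.1) for v in f1],  # nearly a rescaled copy of f1
    "f3": [rng.gauss(0.0, 1.0) for _ in range(500)],    # generated independently of f1
}

flagged = correlated_pairs(columns)
print(flagged)
```

Pairs that get flagged are candidates for dropping one of the two features or combining them before training.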

Timothée HENRY