While examining multiple linear regression (MLR) and logistic regression model summaries, I ran into the problem of perfect multicollinearity when using one-hot encoding without dropping one of the resulting columns. Is it necessary to drop one column along with one-hot encoding for all types of algorithms? My intuition is that, apart from MLR and logistic regression, the dummy variable trap does not apply to other algorithms such as SVM, decision trees, random forests, KNN, k-means, AdaBoost, gradient boosting (classification and regression), and XGBoost (classification and regression). I would appreciate proper and specific guidance on this.
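To make the question concrete, here is a minimal sketch (using NumPy, with made-up data) of the multicollinearity problem being asked about: with a full one-hot encoding, the dummy columns sum to the intercept column, so the design matrix is rank-deficient, while dropping one dummy restores full column rank.

```python
import numpy as np

# Hypothetical 3-level categorical feature, fully one-hot encoded.
onehot = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [0, 1, 0],
    [1, 0, 0],
])

# Design matrix with an intercept column: the three dummies always
# sum to 1, which equals the intercept, so the columns are linearly
# dependent (perfect multicollinearity).
intercept = np.ones((onehot.shape[0], 1))
X_full = np.hstack([intercept, onehot])
print(np.linalg.matrix_rank(X_full))   # 3, not 4: rank-deficient

# Dropping one dummy (the "reference" level) removes the dependence.
X_dropped = np.hstack([intercept, onehot[:, 1:]])
print(np.linalg.matrix_rank(X_dropped))  # 3 == number of columns
```

This rank deficiency is exactly why OLS and (unregularized) logistic regression summaries break down with a full one-hot encoding plus an intercept.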
asked by Morphine691
It is not clear (to me) what your doubt is. Encoding factors with dummies/one-hot is a linear algebra technique, so it is relevant whenever there is some linearity in the model ... but it also seems to be used, at times, with models without linear structure, like trees. Maybe see https://stats.stackexchange.com/questions/390671/random-forest-regression-with-sparse-data-in-python/430127#430127 – kjetil b halvorsen Feb 01 '21 at 12:47
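A minimal sketch of the comment's point about trees (assuming scikit-learn, with made-up toy data): a decision tree splits on one column at a time, so the linear dependence among a full set of dummy columns never enters the fit.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Made-up full one-hot matrix for a 3-level categorical feature,
# with no reference column dropped (the redundant column is kept).
X = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [0, 1, 0],
])
y = np.array([0, 1, 1, 1])

# The tree chooses axis-aligned splits on individual columns; the
# redundancy among the dummies is harmless to this procedure.
tree = DecisionTreeClassifier(random_state=0).fit(X, y)
print(tree.score(X, y))  # fits the separable toy data perfectly: 1.0
```

This contrasts with MLR/logistic regression, where the coefficient estimates themselves are not identifiable under perfect multicollinearity.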