I could not find a good answer or reference explaining why random forests, decision trees, and GBMs are not sensitive to the scale of numerical features.
My sense is that, since boosting methods penalize large errors more heavily, they should certainly be susceptible to the scale of the feature variables.
I have a dataset where most features lie in the 0-100 range, but some values are an order of magnitude larger, in the thousands. Should I scale them?
Based on your experience, does it help to scale features in tree-based algorithms?
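
To make the question concrete, here is a minimal sketch of the kind of experiment I have in mind (assuming scikit-learn; the data is synthetic and just mimics my feature ranges, and `GradientBoostingRegressor` / `MinMaxScaler` are illustrative choices, not my actual pipeline):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.preprocessing import MinMaxScaler

rng = np.random.RandomState(0)

# Synthetic stand-in for my data: one feature in [0, 100],
# one in the thousands.
X = np.column_stack([
    rng.uniform(0, 100, 500),
    rng.uniform(1000, 10000, 500),
])
y = 0.5 * X[:, 0] + 0.001 * X[:, 1] + rng.normal(0, 1, 500)

# Min-max scaling is monotonic, so the ordering of each
# feature's values is unchanged.
X_scaled = MinMaxScaler().fit_transform(X)

# Same hyperparameters and random_state for both fits.
gbm_raw = GradientBoostingRegressor(random_state=0).fit(X, y)
gbm_scaled = GradientBoostingRegressor(random_state=0).fit(X_scaled, y)

# If splits depend only on the ordering of feature values,
# predictions should agree up to floating-point noise.
print(np.allclose(gbm_raw.predict(X), gbm_scaled.predict(X_scaled)))
```

On my understanding, this should print `True`, but I am not sure whether that generalizes, or whether there are cases where scaling does matter for tree-based models.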