I have a hard time finding any theoretical or empirical comparative research on tree-based gradient boosting algorithms applied to data sets with different underlying properties. Is there any reason to believe that one of them is better or worse at handling linearity, nonlinearity, noise, or categorical features (other than implementation-related issues)? Are there any other properties of a data set that are likely to affect the relative performance of these boosting techniques?
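
For concreteness, below is a minimal sketch of the kind of empirical comparison I have in mind, assuming XGBoost and LightGBM as two of the implementations and scikit-learn's Friedman #1 generator as a stand-in for a nonlinear, noisy data set; the parameter values are placeholders, not a tuned benchmark.

```python
# Minimal sketch: compare two gradient boosting implementations on one
# synthetic data set. Assumes xgboost, lightgbm and scikit-learn are installed;
# hyperparameters and the data generator are illustrative placeholders only.
from sklearn.datasets import make_friedman1
from sklearn.model_selection import cross_val_score
from xgboost import XGBRegressor
from lightgbm import LGBMRegressor

# Nonlinear regression target with additive noise (Friedman #1 benchmark).
X, y = make_friedman1(n_samples=2000, noise=1.0, random_state=0)

models = {
    "xgboost": XGBRegressor(n_estimators=300, learning_rate=0.05, random_state=0),
    "lightgbm": LGBMRegressor(n_estimators=300, learning_rate=0.05, random_state=0),
}

# Cross-validated RMSE for each implementation on the same data.
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5,
                             scoring="neg_root_mean_squared_error")
    print(f"{name}: RMSE = {-scores.mean():.3f} (+/- {scores.std():.3f})")
```

The same loop could be repeated over data sets that vary one property at a time (linear vs. nonlinear signal, noise level, presence of categorical features), which is the kind of systematic comparison I have not been able to find published.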