
I am running a random forest on a binary classification variable using about 30 explanatory variables. Please have a look at the screenshot below: the out-of-bag error for class 2 increases as the number of trees increases. This looks weird, as I would expect the out-of-bag error to decrease as more trees are added.

Can somebody explain this or point me in the right direction?

[Screenshot: per-class OOB error rate plotted against the number of trees, with the class 2 error rising]
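
For context, a minimal sketch of how such curves are typically produced, assuming R's randomForest package (a comment below confirms R was used); the data frame `train` and the response column `adherence` are hypothetical stand-ins for the asker's data:

```r
library(randomForest)

## Hypothetical data/column names; the target must be a factor
## for randomForest to run in classification mode.
set.seed(1)
fit <- randomForest(adherence ~ ., data = train, ntree = 1000)

## err.rate has one row per tree: column "OOB" is the overall error,
## followed by one column per class, each cumulated over the first i trees.
head(fit$err.rate)
plot(fit)  # draws per-class OOB error curves like the screenshot
```

The early rows of `err.rate` are based on very few trees (and very few OOB votes per observation), so the left end of such a plot is inherently noisy.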

  • Data, software, code could help clarify. I would say the OOB error looks rather constant. Still unusual, I admit. – Soren Havelund Welling Nov 26 '15 at 07:43
  • The random forest was performed in R. More info on the data: the data are related to HIV patients following a specific treatment. The dependent variable is whether or not they have taken their pill at a specific point in time. The explanatory variables are patient-specific characteristics such as age, gender, etc., together with extra information about previous dates: did the patient take their pill yesterday? Did they report symptoms yesterday? – user3387899 Nov 26 '15 at 08:11
  • OK, nice :) Try to update your question with the command line you ran the model with, and show the header of the data.frame. One explanation could be that the model has no predictive power and adding more trees does not change this. Moreover, the target classes of the training data may be skewed, something like 1:4, but I'm just guessing. – Soren Havelund Welling Nov 26 '15 at 09:08
  • How many instances do you have in your training set? I would try to run random forests with different random seeds. The OOB estimate from a single tree is pretty unreliable. – Simone Nov 26 '15 at 22:56
  • My training set contains 32000 observations. – user3387899 Nov 27 '15 at 07:56
  • 2
  • In light of the size of your data set and the number of predictors, you need to grow a forest of at least 1000 trees before you can even begin to interpret the OOB error rate. – Antoine Dec 06 '15 at 17:34
  • I agree with @Antoine. Even the default number of trees (200) would be helpful. Using 20 trees is barely an ensemble. [link](http://stats.stackexchange.com/questions/164048/can-random-forest-be-used-for-feature-selection-in-multiple-linear-regression/164250#164250) – EngrStudent Dec 14 '15 at 03:47
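
Putting the comments together, here is a hedged sketch of the suggested checks (reusing the hypothetical `train`/`adherence` names from above): verify how skewed the two classes are, then refit with more trees and several seeds to see whether the rising class 2 error is just estimation noise.

```r
library(randomForest)

## 1. Class balance: a skew like the guessed 1:4 makes the minority-class
##    OOB error both higher and noisier.
table(train$adherence)

## 2. Refit with several seeds and at least 1000 trees, as suggested above.
final_oob <- sapply(1:5, function(s) {
  set.seed(s)
  fit <- randomForest(adherence ~ ., data = train, ntree = 1000)
  tail(fit$err.rate[, "OOB"], 1)  # overall OOB error after all trees
})
final_oob  # similar values across seeds suggest the pattern is not seed noise
```

If the per-class curves still diverge after 1000 trees, class imbalance rather than too few trees becomes the more likely explanation.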

0 Answers