
I am running several random forest regression models on different datasets. In each, I have a continuous dependent variable (DV) and ~30 dichotomous predictors. I don't expect these predictors to explain much variance; what I am really interested in is which ones are related to the DV.

In some datasets, the model predicts ~5% of the variance, which is about what I would expect. But in others, it is < 1% and is sometimes negative.

This made me wonder: is there some minimum explained variance below which the importance of predictors in a model shouldn't be interpreted?
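For concreteness, here is a minimal sketch of the kind of model I'm fitting (simulated, illustrative data; the variable names and coefficients are made up):

```r
# Illustrative setup: continuous DV, ~30 dichotomous predictors,
# only a couple of which carry weak signal
library(randomForest)

set.seed(42)
n <- 500
X <- as.data.frame(matrix(rbinom(n * 30, 1, 0.5), nrow = n))
y <- 0.3 * X$V1 - 0.2 * X$V2 + rnorm(n)

rf <- randomForest(x = X, y = y, ntree = 1000, importance = TRUE)
print(rf)        # prints "% Var explained" for regression forests
importance(rf)   # per-predictor %IncMSE and IncNodePurity
```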

Dave
  • How do you measure explained variance? – Dave Nov 04 '21 at 00:04
  • @Dave I'm using the randomForest package, and part of the output is a "% of variance explained". I believe it is an $R^2$. – Dave Nov 04 '21 at 01:32
  • (Nice username) [As a heads up, $R^2$ for nonlinear regression models like random forests does not correspond to the “percentage of variance explained” that it does in the linear case.](https://stats.stackexchange.com/a/547870/247274) – Dave Nov 04 '21 at 01:41
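To make the comments concrete, this is how the printed figure relates to out-of-bag predictions (a sketch reusing the `rf` fit from the snippet above; the package's internal variance denominator may differ slightly from `var()`):

```r
# The printed "% Var explained" is a pseudo R^2 computed from
# out-of-bag (OOB) predictions; it goes negative whenever the OOB
# mean squared error exceeds the variance of y
oob_pred  <- predict(rf)                          # no newdata => OOB predictions
pseudo_r2 <- 1 - mean((y - oob_pred)^2) / var(y)  # may differ slightly from the
100 * pseudo_r2                                   # printed value (denominator convention)

# The same quantity, straight from the fit object:
100 * tail(rf$rsq, 1)
```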

0 Answers