
I've fitted a random forest in R using the randomForest package.

I've called the fitted forest fit.rf.

All I want to know is this: when I type fit.rf, the output shows '% Var explained'. Is that the out-of-bag variance explained?

Karolis Koncevičius
jc52766

1 Answer


Yes, % explained variance is a measure of how well the out-of-bag predictions explain the target variance of the training set. Unexplained variance would be due to true random behaviour or lack of fit.

The % explained variance is retrieved by randomForest:::print.randomForest as the last element of fit.rf$rsq, multiplied by 100.

Documentation on rsq: rsq (regression only) “pseudo R-squared”: 1 - mse / Var(y). Here mse is the mean squared error of the OOB predictions versus the targets, and Var(y) is the variance of the targets.
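As a rough check of the formula above, the printed figure can be recomputed by hand from the out-of-bag predictions. This is only a sketch: the mtcars data and the names oob.pred, oob.mse, var.y, pct.manual and pct.printed are chosen for illustration, and using divisor n (rather than n - 1) for Var(y) reflects my reading of the package source, so treat that detail as an assumption.

```r
library(randomForest)

set.seed(1)
# Regression forest on a built-in data set (mtcars is only an illustration).
fit.rf <- randomForest(mpg ~ ., data = mtcars, ntree = 500)

y <- mtcars$mpg

# Called without newdata, predict() returns the out-of-bag predictions.
oob.pred <- predict(fit.rf)

# OOB mean squared error, and the variance of y with divisor n
# (assumption: this is the variance the package uses for rsq).
oob.mse <- mean((y - oob.pred)^2)
var.y   <- mean((y - mean(y))^2)

# Pseudo R-squared as a percentage, computed by hand ...
pct.manual <- 100 * (1 - oob.mse / var.y)
# ... and the value print() reports: last element of rsq times 100.
pct.printed <- 100 * fit.rf$rsq[length(fit.rf$rsq)]

print(c(manual = pct.manual, printed = pct.printed))
```

If the two numbers agree, the printed '% Var explained' is indeed based on out-of-bag predictions rather than in-sample fits.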

see this answer also.

  • I would not have thought that was possible, as then either the model variance ('mse') or the total variance ('Var(y)') would have to be negative. I'd like to see code that can produce a >100% performance. It can be negative, though, when the model variance is larger than the total variance. The implication is that you would likely be better off with a simple average than with a random forest model. – Soren Havelund Welling Nov 03 '15 at 16:51
  • [note] The above comment was a short reply to somebody who asked "What if explained variance is more than 100%?" and later deleted that comment. – Soren Havelund Welling Nov 05 '15 at 10:21
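The point made in the comment (the value cannot exceed 100% but can drop below zero) can be illustrated with a small sketch on pure noise. The simulated data and the names x, y, fit.noise and pct.var are hypothetical, chosen only for this example. Because the predictors carry no information about the target, the out-of-bag error will typically exceed Var(y) and the percentage will usually come out negative, though the exact value depends on the seed.

```r
library(randomForest)

set.seed(42)
n <- 200
# Hypothetical data: five predictors that are unrelated to y.
x <- data.frame(matrix(rnorm(n * 5), ncol = 5))
y <- rnorm(n)

# Numeric y gives a regression forest, so rsq is available.
fit.noise <- randomForest(x = x, y = y, ntree = 300)

# Same quantity that print() reports as '% Var explained'.
pct.var <- 100 * fit.noise$rsq[length(fit.noise$rsq)]
print(pct.var)
```

Since mse is never negative, 1 - mse / Var(y) can never be greater than 1, which is why a value above 100% should not occur.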