I found these sentences:
PCA before random forest can be useful not for dimensionality reduction but to give your data a shape where random forest can perform better.
I am quite sure that in general, if you transform your data with PCA while keeping the same dimensionality as the original data, you will get a better classification with random forest.
from this page: PCA on high-dimensional text data before random forest classification?
In my case this really does hold, for a regression problem on a dataset of ~1M records and 25 predictors. The error decreases by about 10% if I use all 25 principal components as predictors instead of the 25 original predictors.
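
A minimal sketch of this kind of comparison, assuming scikit-learn and a small synthetic dataset (not the ~1M-record data described above). The function name `compare_rf_with_and_without_pca`, the data-generating process, and all hyperparameters are illustrative assumptions. It fits the same random forest once on the original predictors and once on the full set of principal components, then reports the test RMSE of each. Which variant comes out ahead depends on the data, so no particular result should be expected from this toy example.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split


def compare_rf_with_and_without_pca(X, y, seed=0):
    """Return test RMSE of a random forest on raw vs. PCA-rotated predictors."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )

    # Keep ALL components: this rotates the data, it does not reduce dimension.
    # Fit PCA on the training split only, to avoid leaking test information.
    pca = PCA(n_components=X.shape[1]).fit(X_train)
    Z_train, Z_test = pca.transform(X_train), pca.transform(X_test)

    results = {}
    for name, (A_train, A_test) in {
        "original": (X_train, X_test),
        "pca": (Z_train, Z_test),
    }.items():
        rf = RandomForestRegressor(n_estimators=50, random_state=seed)
        rf.fit(A_train, y_train)
        mse = mean_squared_error(y_test, rf.predict(A_test))
        results[name] = float(np.sqrt(mse))
    return results


# Synthetic, strongly correlated predictors (an assumption for illustration):
# five observed columns are noisy linear mixtures of two latent factors, and
# the target depends on one latent factor, i.e. on an oblique direction in X.
rng = np.random.default_rng(0)
latent = rng.normal(size=(600, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.1 * rng.normal(size=(600, 5))
y = np.sin(2.0 * latent[:, 0]) + 0.1 * rng.normal(size=600)

rmse = compare_rf_with_and_without_pca(X, y)
print(rmse)
```

Because every split in a tree is axis-aligned, rotating the feature space changes which partitions the forest can express cheaply, which is one candidate explanation to test with a setup like this.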
Can anyone help me in understanding and clearly interpreting this result?