
I'm working on a random forest predictive model with a continuous response variable and two continuous features. Normally in RF projects I use some feature selection method to choose which features to use, then fit the RF model on those features. To assess accuracy and related metrics I then use cross-validation, confusion matrices, etc.

However, in this case I have only the two given features. I don't want my entire project to consist of running an RF model on those two features. I'm thinking gradient boosting is what I should learn next? I also think I should experiment with the number of estimators and the depth of the RF. I'm using sklearn in Python, if that helps.
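For concreteness, here is a minimal sketch of the kind of tuning I have in mind, comparing a random forest with gradient boosting under cross-validation. The data is a synthetic stand-in (not my real data), the grid values are arbitrary assumptions, and I'm scoring with RMSE on the assumption that a regression metric suits a continuous response.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.model_selection import GridSearchCV, KFold

# Synthetic stand-in data (assumption): two continuous features, one continuous response.
rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(200, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + rng.normal(scale=0.1, size=200)

# Shuffled 5-fold split, reused for both models so the scores are comparable.
cv = KFold(n_splits=5, shuffle=True, random_state=0)

# Illustrative grids only; the values are arbitrary, not recommendations.
candidates = {
    "random_forest": (
        RandomForestRegressor(random_state=0),
        {"n_estimators": [50, 100], "max_depth": [3, 6, None]},
    ),
    "gradient_boosting": (
        GradientBoostingRegressor(random_state=0),
        {"n_estimators": [50, 100], "max_depth": [2, 3]},
    ),
}

results = {}
for name, (model, grid) in candidates.items():
    # sklearn maximizes scores, so RMSE is exposed as a negated value.
    search = GridSearchCV(model, grid, cv=cv, scoring="neg_root_mean_squared_error")
    search.fit(X, y)
    results[name] = {
        "best_params": search.best_params_,
        "cv_rmse": -search.best_score_,
    }
    print(name, results[name])
```

Reusing the same `KFold` object for both searches keeps the folds identical, so any difference in `cv_rmse` comes from the models rather than from the split.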

Any other suggestions? This type of problem is unexplored territory for me, so I'm looking for best practices to add to my data science toolkit. Thanks!

matta
  • Related-if-not-outright-duplicate: [How to know that your machine learning problem is hopeless?](https://stats.stackexchange.com/q/222179/1352) – Stephan Kolassa Sep 10 '19 at 06:33
  • Why would you even use random forest for such a low-dimensional problem? I would plot the data and then try to fit an appropriate parametric model. – Scholar Sep 13 '19 at 12:33
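
The parametric alternative suggested in the last comment could be sketched roughly as follows. This is only an illustration under stated assumptions: the data is synthetic, and the quadratic form is a hypothetical choice standing in for whatever shape a plot of the real data would suggest. Plotting itself is left out to keep the example short.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in data (assumption): the response is quadratic in the two features.
rng = np.random.default_rng(1)
X = rng.uniform(-2, 2, size=(200, 2))
y = 1.0 + 2.0 * X[:, 0] - X[:, 1] ** 2 + rng.normal(scale=0.1, size=200)

cv = KFold(n_splits=5, shuffle=True, random_state=1)

models = {
    # Parametric candidate: ordinary least squares on degree-2 polynomial terms.
    "quadratic": make_pipeline(PolynomialFeatures(degree=2), LinearRegression()),
    # Nonparametric reference: a random forest with default settings.
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=1),
}

# Mean cross-validated R^2 for each model on the same folds.
scores = {
    name: cross_val_score(model, X, y, cv=cv, scoring="r2").mean()
    for name, model in models.items()
}
for name, score in scores.items():
    print(f"{name}: mean CV R^2 = {score:.3f}")
```

If a simple parametric form scores about as well as the forest on held-out folds, it is usually the easier model to interpret and report.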
