Comparing different machine learning models for extrapolation

Asked Dec 18 '18 at 05:58

I am trying to fit a nonlinear regression model to a set of data points that I know is incomplete. When I visualize the data, the relationship between my features and the dependent variable looks quite simple (roughly a third-degree polynomial). Within the range of the data, I find little out-of-sample difference in predictive power between ANN, SVR, boosted trees, etc.

However, if I expect to encounter points outside the range of my sample, which classes of models should I use for better performance? Intuitively, it seems that trees should be avoided entirely. Might SVR with C forced to be low be the best among bad choices? Are there any theoretical insights or best practices for this?

hjw

If you have domain knowledge about what you can expect then yes, you could choose a better option. If you have no knowledge about what to expect then it's anybody's guess. – user2974951 Dec 18 '18 at 08:18
Tree-based ensemble models (like Random Forests) will bound predictions to the range of target values seen in the training set. So yes, sometimes, this is something that you don't want. – daruma Oct 06 '21 at 23:57
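
The bounding behaviour mentioned in the last comment can be illustrated with a minimal sketch. It assumes a hypothetical cubic ground truth (`true_fn`) and uses a small hand-rolled one-dimensional regression tree (`fit_tree`, `predict`) as a stand-in for a library implementation; none of these names come from the question. Because each leaf predicts the mean of the training targets that fall into it, any query to the right of the training range lands in the rightmost leaf and gets the same value.

```python
import random
from statistics import mean


def true_fn(x):
    # Hypothetical cubic ground truth, chosen only for illustration.
    return x ** 3 - 2.0 * x


def sse(values):
    # Sum of squared deviations from the mean of `values`.
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def fit_tree(pairs, depth=4, min_leaf=5):
    """Fit a 1-D regression tree on (x, y) pairs that are sorted by x.

    Returns either a float (the leaf mean) or a tuple
    (threshold, left_subtree, right_subtree).
    """
    ys = [y for _, y in pairs]
    if depth == 0 or len(pairs) < 2 * min_leaf:
        return mean(ys)
    # Pick the split index that minimises the total squared error.
    best_i = min(
        range(min_leaf, len(pairs) - min_leaf + 1),
        key=lambda i: sse(ys[:i]) + sse(ys[i:]),
    )
    threshold = (pairs[best_i - 1][0] + pairs[best_i][0]) / 2.0
    return (
        threshold,
        fit_tree(pairs[:best_i], depth - 1, min_leaf),
        fit_tree(pairs[best_i:], depth - 1, min_leaf),
    )


def predict(tree, x):
    # Walk down the tree until a leaf (a plain float) is reached.
    while isinstance(tree, tuple):
        threshold, left, right = tree
        tree = left if x <= threshold else right
    return tree


rng = random.Random(0)
train_x = sorted(rng.uniform(-2.0, 2.0) for _ in range(200))
train = [(x, true_fn(x) + rng.gauss(0.0, 0.1)) for x in train_x]
tree = fit_tree(train)

y_min = min(y for _, y in train)
y_max = max(y for _, y in train)
print(f"training target range: [{y_min:.2f}, {y_max:.2f}]")
for x in (1.0, 3.0, 5.0):
    print(f"x={x}: tree={predict(tree, x):.2f}, true={true_fn(x):.2f}")
```

At x=3 and x=5 the tree returns the same leaf mean, which stays inside the training target range, while the assumed cubic keeps growing. A parametric model with the right functional form would follow that growth, but only if the assumed form really holds outside the sampled range, which is the domain-knowledge caveat raised in the first comment.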
