
I am using SVR to build a regression model. The class has several hyperparameters, and I am using the hyperopt library to find the best values for them given my features.

When using hyperopt, it is necessary to define the search space for those hyperparameters. I don't know which values to use for SVR, so I am including all the possible ones. E.g., I put every available kernel ('linear', 'poly', 'rbf', 'sigmoid') in the search space, as shown in the snippet. However, for some hyperparameters, such as the regularization hyperparameter (C) or epsilon, a wide range means hyperopt can take ages to find the best values.

I am guessing the best way to reduce hyperopt's compute time is to define a smaller search space, but for that we need to know which intervals to choose or which options to select. This is the part where I don't know what to do. E.g., is the interval [0, 1] good enough for C, or should it be [0.001, 0.1]?

So, my question is: how do I define a smaller search space before running hyperopt? Are there techniques for doing this? Where can I get more information about it?

    from hyperopt import hp

    # Candidate values for the categorical hyperparameters
    list_kernel = ['linear', 'poly', 'rbf', 'sigmoid']
    list_gamma = ['scale', 'auto']

    # Search space passed to hyperopt
    space = {
        'kernel': hp.choice('kernel', list_kernel),
        'degree': hp.uniform('degree', 3, 6),
        'gamma': hp.choice('gamma', list_gamma),
        'C': hp.uniform('C', 0, 1),
    }

PS: This is a general question. I am giving an example with SVR, but XGBoost, for instance, has many more hyperparameters.

Sycorax
xeon123

1 Answer

  • Polynomial degree is only used for the polynomial kernel, so "tuning" degree while using the RBF kernel is just wasted time.
  • Gamma is not used by the linear kernel, so tuning it there is likewise a waste (in scikit-learn it is the kernel coefficient for the RBF, polynomial and sigmoid kernels). Gamma can be any positive number; the heuristic values 'scale' and 'auto' might or might not be useful for your problem.
  • C can be any positive number; restricting it to the range $[0,1]$ is implausibly narrow.

Given these facts, and your stated goal of reducing the time it takes to tune the SVR, my recommendation is to just use an RBF kernel and only tune C and gamma. RBF kernels generally perform well on a wide range of problems, so they are a safe default. For some suggestions for $C$ and $\gamma$ in the case of the RBF kernel, see: Which search range for determining SVM optimal C and gamma parameters?

Sycorax