39

I am using SVM for classification and I am trying to determine the optimal parameters for linear and RBF kernels. For the linear kernel I use cross-validated parameter selection to determine C and for the RBF kernel I use grid search to determine C and gamma.

I have 20 (numeric) features and 70 training examples that should be classified into 7 classes.

Which search range should I use for determining the optimal values for the C and gamma parameters?

Ferdi
Kywia

2 Answers

37

Check out A Practical Guide to Support Vector Classification (Hsu, Chang, and Lin) for some pointers, particularly page 5:

We recommend a "grid-search" on $C$ and $\gamma$ using cross-validation. Various pairs of $(C,\gamma)$ values are tried and the one with the best cross-validation accuracy is picked. We found that trying exponentially growing sequences of $C$ and $\gamma$ is a practical method to identify good parameters (for example, $C = 2^{-5},2^{-3},\ldots,2^{15};\gamma = 2^{-15},2^{-13},\ldots,2^{3}$).

Remember to normalize your data first and, if you can, gather more data: with only 70 examples for 20 features and 7 classes, your problem might be heavily underdetermined.
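As a rough illustration, the grid from the quoted guide could be set up with scikit-learn along these lines. This is only a sketch: the helper names `exponential_grid` and `search_rbf_svm`, the choice of 5 stratified folds, and the use of `StandardScaler` for normalization are my assumptions, not something the guide prescribes.

```python
def exponential_grid(low_exp, high_exp, step=2, base=2.0):
    """Return base**k for k = low_exp, low_exp + step, ..., high_exp."""
    return [base ** k for k in range(low_exp, high_exp + 1, step)]


# Ranges from the quoted guide:
# C = 2^-5, 2^-3, ..., 2^15 and gamma = 2^-15, 2^-13, ..., 2^3.
C_GRID = exponential_grid(-5, 15)
GAMMA_GRID = exponential_grid(-15, 3)


def search_rbf_svm(X, y, n_folds=5):
    """Cross-validated grid search over (C, gamma) for an RBF-kernel SVM.

    n_folds=5 is an assumed default, not a value from the guide.
    """
    # Imported here so the grid helper above works without scikit-learn.
    from sklearn.model_selection import GridSearchCV, StratifiedKFold
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    # Scaling sits inside the pipeline so it is refit on each training fold
    # and the held-out fold never leaks into the normalization statistics.
    model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
    param_grid = {"svc__C": C_GRID, "svc__gamma": GAMMA_GRID}
    cv = StratifiedKFold(n_splits=n_folds, shuffle=True, random_state=0)
    search = GridSearchCV(model, param_grid, cv=cv, scoring="accuracy")
    search.fit(X, y)
    return search.best_params_, search.best_score_
```

Calling `search_rbf_svm(X, y)` on a feature matrix and label vector returns the best $(C,\gamma)$ pair and its mean cross-validation accuracy. For the linear kernel, the same pattern should work with `SVC(kernel="linear")` and only the `svc__C` entry in the grid.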

ciri
13

Check out section 2.3.2 of this paper by Chapelle and Zien. They have a nice heuristic for selecting a good search range for $\sigma$ of the RBF kernel and $C$ for the SVM. I quote:

To determine good values of the remaining free parameters (e.g., by CV), it is important to search on the right scale. We therefore fix default values for $C$ and $\sigma$ that have the right order of magnitude. In a $c$-class problem we use the $1/c$ quantile of the pairwise distances $D^\rho_{ij}$ of all data-points as a default for $\sigma$. The default for $C$ is the inverse of the empirical variance $s^2$ in feature space, which can be calculated by $s^2 = \frac{1}{n} \sum_i K_{ii} - \frac{1}{n^2}\sum_{i,j} K_{ij}$ from an $n\times n$ kernel matrix $K$.

Afterwards, they use multiples (e.g. $2^k$ for $k\in \{-2,\ldots,2\}$) of the default values as the search range in a cross-validated grid search. That has always worked very well for me.

Of course, as @ciri said, normalizing the data etc. is always a good idea.
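The heuristic quoted above can be sketched in plain Python as follows. Treat it as an illustration under assumptions: I use Euclidean distance in place of the paper's $D^\rho_{ij}$, a nearest-rank rule for the quantile, and the kernel $\exp(-\|x-y\|^2 / (2\sigma^2))$; all function names are my own. NumPy would be the natural choice for larger data sets.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance (an assumed stand-in for the paper's D^rho_ij)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def default_sigma(points, n_classes):
    """1/c quantile of all pairwise distances, using the nearest-rank rule."""
    dists = sorted(euclidean(a, b) for a, b in combinations(points, 2))
    rank = max(1, math.ceil(len(dists) / n_classes))
    return dists[rank - 1]


def rbf_kernel_matrix(points, sigma):
    """K_ij = exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    gamma = 1.0 / (2.0 * sigma ** 2)
    return [[math.exp(-gamma * euclidean(a, b) ** 2) for b in points]
            for a in points]


def default_C(K):
    """Inverse of s^2 = (1/n) sum_i K_ii - (1/n^2) sum_ij K_ij."""
    n = len(K)
    mean_diag = sum(K[i][i] for i in range(n)) / n
    mean_all = sum(sum(row) for row in K) / n ** 2
    return 1.0 / (mean_diag - mean_all)


def search_range(default, exponents=range(-2, 3)):
    """Multiples 2^k of the default value, for k = -2, ..., 2."""
    return [default * 2.0 ** k for k in exponents]
```

For example, with `sigma0 = default_sigma(X, 7)` and `C0 = default_C(rbf_kernel_matrix(X, sigma0))`, the grid would be `search_range(sigma0)` crossed with `search_range(C0)`. If your SVM library is parameterized by gamma rather than sigma, convert each candidate with `1 / (2 * sigma ** 2)`.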

fabee
  • I think there are several equivalent RBF kernel formulations: one with gamma and another with sigma, i.e. gamma = 1/(2 sigma^2). Does the gamma in the above heuristic correspond to gamma, sigma or sigma^2? I have found other descriptions of the same heuristic which are for gamma. – machinery Apr 30 '16 at 15:25
  • If you check the linked paper, it is $\frac{1}{2\sigma^2}$ – fabee May 01 '16 at 17:38
  • @fabee Should peer testing be done manually? Is there no library for it? – userStack Aug 11 '18 at 13:40