The intuitive explanation for the gamma parameter of the RBF kernel in SVMs is the following:
Intuitively, the gamma parameter defines how far the influence of a single training example reaches, with low values meaning ‘far’ and high values meaning ‘close’. The gamma parameters can be seen as the inverse of the radius of influence of samples selected by the model as support vectors.
https://scikit-learn.org/stable/auto_examples/svm/plot_rbf_parameters.html#rbf-svm-parameters
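To make the quoted intuition concrete, here is a minimal sketch that evaluates the RBF kernel k(x, x') = exp(-gamma * ||x - x'||^2) at a few distances for several gamma values. The helper name `rbf_kernel_value` and the chosen distances and gamma values are mine, picked only for illustration.

```python
import numpy as np


def rbf_kernel_value(distance, gamma):
    """RBF kernel value exp(-gamma * d^2) for two points a distance d apart.

    Hypothetical helper for illustration, not part of scikit-learn.
    """
    return np.exp(-gamma * np.asarray(distance, dtype=float) ** 2)


# Distances between a support vector and some other point (illustrative values).
distances = np.array([0.0, 0.5, 1.0, 2.0])

for gamma in (0.1, 1.0, 10.0):
    values = rbf_kernel_value(distances, gamma)
    # A small gamma keeps the kernel value high at large distances ('far' reach);
    # a large gamma makes it drop off quickly ('close' reach).
    print(f"gamma={gamma:>5}: {np.round(values, 4)}")
```

At distance 0 the kernel value is 1 for every gamma; the larger gamma is, the faster it decays as the distance grows, which is the "radius of influence" reading from the quote.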
In sklearn.svm.SVC the default value of the parameter gamma is 'scale', i.e. gamma = 1 / (n_features * X.var()). What is the explanation for this default choice of gamma and why does it work so well (at least for my dataset, I couldn't beat this value with an extensive grid search for gamma)?
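For reference, here is a small sketch of how the 'scale' value can be reproduced by hand and compared against the default. It uses a synthetic dataset from `make_classification` (my assumption, not the dataset from the question), and the helper `scale_gamma` is a hypothetical name. It also checks one property that follows directly from the formula: multiplying X by a constant changes gamma so that the kernel matrix stays the same.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC

# Synthetic data, used only for illustration.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)


def scale_gamma(X):
    """gamma = 1 / (n_features * X.var()), the formula quoted for gamma='scale'.

    Hypothetical helper; X.var() is the variance over all entries of X.
    """
    return 1.0 / (X.shape[1] * X.var())


gamma_manual = scale_gamma(X)

# Fit once with the default and once with the hand-computed value.
clf_scale = SVC(kernel="rbf", gamma="scale").fit(X, y)
clf_manual = SVC(kernel="rbf", gamma=gamma_manual).fit(X, y)
matches_default = np.allclose(
    clf_scale.decision_function(X), clf_manual.decision_function(X)
)

# Rescaling X by c multiplies squared distances and X.var() by c^2,
# so gamma * ||x - x'||^2 (and hence the kernel matrix) is unchanged.
X_rescaled = 10.0 * X
K_original = rbf_kernel(X, gamma=scale_gamma(X))
K_rescaled = rbf_kernel(X_rescaled, gamma=scale_gamma(X_rescaled))
kernel_unchanged = np.allclose(K_original, K_rescaled)

print("manual gamma:", gamma_manual)
print("matches gamma='scale':", matches_default)
print("kernel unchanged after rescaling X:", kernel_unchanged)
```

This only shows what the formula does, not why it was chosen as the default. Note also that if the features are standardized to unit variance, X.var() is close to 1 and the value is close to 1 / n_features, which is what gamma='auto' uses.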