
I am new to machine learning and SVMs. I have a general question regarding the optimization of parameters in one-class SVM in libsvm in R. I found similar posts but no conclusive answer. Can someone tell me: do I need to optimize both the c (cost) and nu parameters in a one-class SVM?

My preliminary results using the e1071 package (the libsvm implementation in R) show that the predictions (at least the number of positive versus negative classifications) depend only on nu, not on c.

Thanos

2 Answers


The C (cost) parameter belongs to the C-SVM classification formulation. One-class SVM uses the nu parameter instead, so as long as you optimize nu you are fine: nu is an upper bound on the fraction of training points treated as outliers (and a lower bound on the fraction of support vectors). Please check this link
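As a quick illustration of that bound, here is a minimal sketch (assuming the e1071 package is installed; the data are synthetic, not from the question):

```r
# Minimal sketch: the fraction of training points rejected as outliers
# by a one-class SVM stays close to (and is controlled by) nu.
library(e1071)

set.seed(1)
x <- matrix(rnorm(200 * 2), ncol = 2)    # synthetic 2-D training data

fit  <- svm(x, type = "one-classification", nu = 0.1)
pred <- predict(fit, x)                  # TRUE = inlier, FALSE = outlier

mean(!pred)                              # roughly 0.1, i.e. about nu
```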

Depending on the kernel function you choose, you may have to optimize gamma too, since the decision boundary depends on both parameters. If you choose the linear kernel there is no gamma to tune; for the other kernels, libsvm's default gamma is 1/n, where n is the number of features.

To tune your parameters, you can use the tune.svm function:

tuned <- tune.svm(x = yourFeatures, y = yourLabelY,
                  nu    = seq(0.001, 0.5, by = 0.01),  # note: 0.001:0.5 would yield only a single value in R
                  gamma = 10^(-2:0),
                  type  = "one-classification")

Then you can use the best values to fit your one-class SVM model:

model <- svm(x = yourFeatures, y = yourLabelY, type = "one-classification",
             nu    = tuned$best.parameters$nu,
             gamma = tuned$best.parameters$gamma)

This means your nu value will be the best value between 0.1% and 50%: you allow up to half of the training points to be treated as outliers, and the tuning keeps the best value in that range.
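Once fitted, the model can be applied to new observations. A sketch with made-up data (the names trainFeatures and newFeatures are placeholders, and the nu and gamma values stand in for whatever tuning returns):

```r
# Sketch with synthetic data: predict() on a one-class SVM returns
# TRUE for points inside the learned region and FALSE for outliers.
library(e1071)

set.seed(42)
trainFeatures <- matrix(rnorm(100 * 2), ncol = 2)  # stand-in training set
newFeatures   <- rbind(c(0, 0),                    # a typical point
                       c(8, 8))                    # an extreme point

model <- svm(trainFeatures, type = "one-classification",
             nu = 0.05, gamma = 0.5)               # e.g. values found by tuning

predict(model, newFeatures)   # the typical point should be TRUE, the extreme one FALSE
```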

Turay Melo
  • Hi, thanks for the answer. Just one question:`nu = 0.001:0.5,..., This means your nu value will be the best value between 1% and 50%` shouldn't this mean a value between *0.1%* and 50%, or am I getting something wrong? – mkaran Nov 28 '16 at 10:03
  • Hi @mkaran. Sorry for late answer, but you are right. It is from 0.1% to 50%. Regards – Turay Melo May 06 '17 at 15:35

nu and C are constants from two alternative formulations of the SVM; you have to choose one or the other. If you are using the svm function of e1071, please check the type parameter at http://www.rdocumentation.org/packages/e1071/functions/svm

type="C-classification", the default, uses the C-SVM formulation, which requires the C parameter (called cost in the function)

type="nu-classification" uses the nu-SVM formulation, which requires setting the nu parameter. (The one-class variant, type="one-classification", likewise takes nu.)
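To make the two formulations concrete, a small sketch on a built-in dataset (assuming e1071; the specific cost and nu values are arbitrary choices for illustration):

```r
# The same two-class problem fitted with both formulations:
# C-SVM takes `cost`, nu-SVM takes `nu`; everything else is identical.
library(e1071)

data(iris)
two <- droplevels(subset(iris, Species != "virginica"))  # an easily separable 2-class subset

c_svm  <- svm(Species ~ ., data = two, type = "C-classification",  cost = 1)
nu_svm <- svm(Species ~ ., data = two, type = "nu-classification", nu = 0.1)

# Both yield valid classifiers; only the regularisation parameter differs
table(predict(c_svm,  two), two$Species)
table(predict(nu_svm, two), two$Species)
```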

Jacques Wainer