SVM cost parameter

Question

In an SVM with a linear kernel, could you explain to me what exactly the C parameter is/represents?

An example of why it's important to select a good value for C would also be appreciated.

Thank you.

score 3 · Accepted Answer · answered Jul 06 '14 at 13:17

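The $c$ parameter tells the algorithm how to balance two competing objectives: maximizing the margin between the two classes and keeping training samples from being misclassified (or from falling inside the margin). As $c \to \infty$ the algorithm tolerates no violations at all (a hard margin), so if your data is not linearly separable it will not be able to find a separating hyperplane. For finite $c>0$ the algorithm can trade off some misclassified samples in order to find a wider margin that better separates the remaining points; the smaller $c$ is, the cheaper each violation becomes and the wider the margin tends to be. This is why the choice matters: a very large $c$ can bend the boundary around a few noisy points, while a very small $c$ can ignore too many points and underfit.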
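
This trade-off is usually written as the soft-margin primal problem. The following is a sketch under the common convention (the one used by libsvm and scikit-learn); the symbols $w$, $b$, $\xi_i$, $x_i$, $y_i$ and $n$ are notation introduced here, not taken from the answer:

$$\min_{w,\,b,\,\xi}\ \frac{1}{2}\lVert w\rVert^2 + c\sum_{i=1}^{n}\xi_i \quad \text{subject to}\quad y_i\,(w^\top x_i + b) \ge 1 - \xi_i,\quad \xi_i \ge 0 .$$

The margin width is $2/\lVert w\rVert$, so the first term rewards a wide margin, while each slack $\xi_i > 0$ measures how far sample $i$ sits inside the margin or on the wrong side of the hyperplane. $c$ sets the price of that slack relative to margin width.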

You should try a range of values for $c$, typically spaced on a logarithmic grid, and keep the one that performs best on held-out validation data.
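
A minimal sketch of such a search, assuming scikit-learn is available. The synthetic dataset from `make_classification`, the grid of values, and the helper name `sweep_c` are illustrative assumptions, not part of the answer. It fits a linear-kernel `SVC` for each value and records the margin width $2/\lVert w\rVert$ alongside the validation accuracy:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC


def sweep_c(c_values, seed=0):
    """Fit a linear SVM per value of C; return (C, margin width, validation accuracy)."""
    # Synthetic, slightly noisy two-class data (an assumption for illustration).
    X, y = make_classification(
        n_samples=400, n_features=2, n_informative=2, n_redundant=0,
        n_clusters_per_class=1, flip_y=0.05, class_sep=1.0, random_state=seed,
    )
    X_train, X_val, y_train, y_val = train_test_split(
        X, y, test_size=0.3, random_state=seed
    )

    results = []
    for c in c_values:
        model = SVC(kernel="linear", C=c).fit(X_train, y_train)
        # For a linear kernel, coef_ holds w; the margin width is 2 / ||w||.
        margin = 2.0 / np.linalg.norm(model.coef_)
        accuracy = model.score(X_val, y_val)  # fraction classified correctly
        results.append((c, margin, accuracy))
    return results


results = sweep_c([0.001, 0.01, 0.1, 1, 10, 100])
for c, margin, accuracy in results:
    print(f"C={c:<7} margin={margin:8.3f} val_accuracy={accuracy:.3f}")

# Keep the value with the highest validation accuracy.
best_c = max(results, key=lambda r: r[2])[0]
print("best C:", best_c)
```

The margin column should shrink as the value grows, since violations become more expensive. Which value gives the best validation accuracy depends on the data, which is why it has to be searched for rather than fixed in advance.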

Aaron

Thank you Aaron! One follow-up question: To determine which C is best, should I evaluate the models (each with a different C) based on accuracy percentage (i.e. correctly classifying points from the validation data set) or should I use something else? – Glenn Jul 07 '14 at 08:56
Yes, you should do exactly that. – Aaron Jul 07 '14 at 14:22
Thanks! So in the case of C>0, would high C values result in smaller margins (even though an SVM tries to maximize the margin) and fewer misclassifications, because misclassifications incur a higher 'penalty'? – Glenn Jul 07 '14 at 16:42