In a SVM with linear kernel, could you explain to me what exactly the C parameter is/represents?
An example why it's important to select a good value for C would also be appreciated.
Thank you.
In a SVM with linear kernel, could you explain to me what exactly the C parameter is/represents?
An example why it's important to select a good value for C would also be appreciated.
Thank you.
The $c$ parameter tells the algorithm how to balance the two competing objectives which are to maximize the margin between the two classes and to not allow any samples to be misclassified. If $c=0$ then the algorithm does not allow any samples to be misclassified. If your data is not linearly separable then the algorithm will not be able to find a separating hyperplane. If $c>0$ then the algorithm can trade-off some misclassified samples in-order to find a margin that better separates the remaining points.
You should try a variety of values for $c$ and see which one works best.