I want to try Support Vector Machines (SVMs) on my dataset. Before I start, though, I was warned that SVMs don't perform well on extremely unbalanced data. In my case, the data can be as much as 95-98% 0's and only 2-5% 1's.
I tried to find resources that discuss using SVMs on sparse/unbalanced data, but all I could find was 'sparse SVMs' (which use a small number of support vectors).
I was hoping someone could briefly explain:
- How well an SVM would be expected to perform on such a dataset
- Which modifications, if any, must be made to the SVM algorithm
- What resources/papers discuss this
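
To make the setup concrete, here is a minimal sketch of the kind of imbalance described above. Everything in it is an assumption for illustration: it uses scikit-learn, a synthetic dataset from `make_classification` with roughly 97% 0's and 3% 1's rather than my real data, and `class_weight="balanced"` as one candidate modification (it scales the penalty `C` per class), not a confirmed fix.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for the real data (assumed): about 97% 0's, 3% 1's.
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.97, 0.03], random_state=0
)

# Stratify so both splits keep roughly the same class ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Unmodified SVM: every misclassification costs the same.
plain = SVC(kernel="rbf").fit(X_train, y_train)

# Candidate modification: weight each class inversely to its frequency,
# so mistakes on the rare 1's are penalised more heavily.
weighted = SVC(kernel="rbf", class_weight="balanced").fit(X_train, y_train)

# Accuracy is misleading here (always predicting 0 already scores ~97%),
# so compare recall on the minority class instead.
plain_recall = recall_score(y_test, plain.predict(X_test))
weighted_recall = recall_score(y_test, weighted.predict(X_test))
print("minority recall, plain SVM:   ", plain_recall)
print("minority recall, weighted SVM:", weighted_recall)
```

Recall on the 1's is used for the comparison because plain accuracy says almost nothing at this class ratio.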