I have a dataset with about 1M negative examples and 4,700 positive examples. I'm trying to build a classifier that predicts the probability that an example is positive.
Given how skewed the data is, should I just give up, or are there algorithms that perform well on skewed data?