I am looking for suggestions on which methods are appropriate for training a model on a dataset with highly skewed outcome classes. The ratio of Class 0 to Class 1 is about 20:1, and I want to maximize accuracy in identifying Class 1 outcomes. This is similar to often-discussed problems such as cancer detection.
I have used some methods before, but I am trying to find out whether there is a comprehensive resource, or a set of suggestions, that covers the different methods for these cases. Examples of how they are applied in R (packages, etc.) or with caret would be useful. It is a sparse dataset with about 100K examples, of which 5000 belong to Class 1 and the rest to Class 0. Each example has about 20 features, and the data includes null values.