Let's say I have a dataset with 100,000 class A training observations and 400 class B training observations. I want to use a support vector machine (SVM) for this binary classification problem. Instead of applying random undersampling or SMOTE, I want to apply the following method: I will divide my class A observations into 400 distinct batches of 250 observations each (100,000/400 = 250) and add all 400 class B observations to each of the 400 batches. Then, I will take the average of the results (accuracy, F1, average precision) obtained from each of the 400 batches.
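To make the setup concrete, here is a minimal sketch of the batching I have in mind. It uses only the standard library, and integer indices stand in for the actual observations. The names `make_batches`, `average_scores`, and `fit_and_score` are hypothetical, chosen just for illustration; fitting the SVM and computing a metric is left to whatever function is passed in as `fit_and_score`.

```python
import random
from statistics import mean


def make_batches(majority, minority, n_batches, seed=0):
    """Split `majority` into `n_batches` disjoint chunks and append all of `minority` to each."""
    shuffled = list(majority)
    random.Random(seed).shuffle(shuffled)
    # Strided slicing yields disjoint chunks whose sizes differ by at most one.
    return [shuffled[i::n_batches] + list(minority) for i in range(n_batches)]


def average_scores(batches, fit_and_score):
    """Average the per-batch metric returned by the caller-supplied `fit_and_score`."""
    return mean(fit_and_score(batch) for batch in batches)


# Indices stand in for observations: 100,000 class A and 400 class B.
class_a = range(100_000)
class_b = range(100_000, 100_400)
batches = make_batches(class_a, class_b, n_batches=400)
print(len(batches), len(batches[0]))
```

Each batch then contains 250 class A observations plus all 400 class B observations, so every class B observation is reused in all 400 batches.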
Is such a method completely wrong? Would it give overly optimistic results? If so, what misleading effects could it have?
Thank you.