I'm trying to train a neural network for rare event detection. As a result, I have roughly 1000 times more non-target (everything else) examples than target examples. I was wondering: if I simply repeat the set of target examples until the two classes are balanced, what effect would that have on my classification and generalisation performance? Would I gain anything? What price would I be paying this way?
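For concreteness, here is a rough sketch of the kind of duplication I have in mind (hypothetical variable names; `X` and `y` are my training arrays and label 1 marks the target class):

```python
import numpy as np

def oversample_by_duplication(X, y, target_label=1, seed=0):
    """Repeat target-class rows (with replacement) until both classes have the same count."""
    rng = np.random.default_rng(seed)
    target_idx = np.where(y == target_label)[0]
    other_idx = np.where(y != target_label)[0]

    # Draw target indices with replacement up to the majority-class size.
    repeated_idx = rng.choice(target_idx, size=len(other_idx), replace=True)
    all_idx = np.concatenate([other_idx, repeated_idx])
    rng.shuffle(all_idx)  # spread the duplicated rows across training batches
    return X[all_idx], y[all_idx]
```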
And in general, what is the best/most common way to deal with such situations? I can't do boosting or bagging, as I cannot afford to train several models. I have computational and memory restrictions, and I will have only one model (e.g. one neural network) for decision making at test time.
Thanks!
* To be clear, my question is: what is the effect of duplicating the target-class examples with as many copies as required to balance the number of examples in both classes? More specifically, I am using a neural network as the model. What do I gain or lose by doing this?