
A lot has been written about class imbalance in machine learning (for example, on this site).

But how should one deal with "intra-class" imbalance?

Assume I want to classify bikes vs. cars. My training/test data is 50% bikes and 50% cars (no class imbalance). However, within the car half, 1% are Formula 1 cars, 70% small cars, 20% SUVs, 7% pickups and 2% jeeps.

So there is no class imbalance, but there is what I call "intra-class" imbalance.
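To make the numbers concrete, here is a minimal sketch of the arithmetic. The percentages are the ones given above; the dictionary names and the inverse-frequency weighting are only an illustrative assumption (one way the usual class-imbalance remedy could be carried over to subtypes), not a recommended solution.

```python
# Share of each car subtype within the car class (figures from the question).
car_subtype_share = {
    "formula_1": 0.01,
    "small_car": 0.70,
    "suv": 0.20,
    "pickup": 0.07,
    "jeep": 0.02,
}
car_class_share = 0.50  # cars are half of the full dataset

# Share of each subtype in the full dataset (bikes included).
overall_share = {
    name: car_class_share * share for name, share in car_subtype_share.items()
}

# Hypothetical inverse-frequency weights: every subtype receives the same
# total weight inside the car class, and the average weight per car is 1.
n_subtypes = len(car_subtype_share)
subtype_weight = {
    name: 1.0 / (n_subtypes * share) for name, share in car_subtype_share.items()
}

for name in car_subtype_share:
    print(f"{name:10s} overall share={overall_share[name]:.3f} "
          f"weight={subtype_weight[name]:.2f}")
```

Formula 1 cars end up as only 0.5% of all samples, so under this weighting each of them would count roughly 20 times as much as an average car, while each small car would count for less than a third.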

My questions:

  • Is this a problem from an ML/statistics point of view? Why or why not?
  • What is the name/keyword for the problem that I call "intra-class imbalance"?
kjetil b halvorsen
robert
  • I would regard this as regular class imbalance, since it will have the same type of effects, and it should be dealt with in the same fashion as regular class imbalance. – user2974951 Dec 10 '18 at 07:30
  • What would be the issues in using the same solution as for inter-class imbalance problems? – discipulus Dec 11 '18 at 01:44

0 Answers