
I am trying to classify a target group versus controls using an SVM. I am predicting probabilities, and I noticed that when I score the target class the SVM's performance is horrible (AUC ~0.2). This made me think that if I scored the control class instead, the performance should improve, which it did.

Essentially, I have an algorithm that accurately predicts the opposite of what I want.

Is there anything inherently wrong with this, and would it be OK to use the predicted probability of being a control as a way of classifying the target group?

Additionally, does anyone have any idea why the performance is poor when predicting the targets but good when predicting the controls?
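
For reference, here is a minimal sketch of the kind of setup I mean. It assumes scikit-learn's `SVC` with `probability=True`; the data are synthetic (`make_classification`) and purely illustrative, with class 1 standing in for the target group and class 0 for the controls. The point is that scoring the wrong column of `predict_proba` inverts the ranking, so the AUC comes out mirrored around 0.5:

```python
# Minimal sketch; assumes scikit-learn, synthetic data only.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import roc_auc_score

# Illustrative data: class 1 = "target", class 0 = "control".
X, y = make_classification(n_samples=500, weights=[0.8, 0.2], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

svm = SVC(probability=True, random_state=0).fit(X_tr, y_tr)
proba = svm.predict_proba(X_te)  # columns are ordered according to svm.classes_

# Scoring the control-class column against the target labels inverts the
# ranking, so the AUC is mirrored around 0.5 (e.g. 0.8 shows up as 0.2).
auc_wrong_col = roc_auc_score(y_te, proba[:, 0])  # P(class 0) scored vs class 1
auc_right_col = roc_auc_score(y_te, proba[:, 1])  # P(class 1), the positive class

print(svm.classes_)                  # check which column corresponds to which class
print(auc_wrong_col, auc_right_col)  # auc_wrong_col == 1 - auc_right_col
```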

  • Welcome to the forum. IIUC, can you try inverting the labels? Map 0 to 1 and 1 to 0 (a small illustration of this follows the comments). I had a similar problem and later inverted the labels (and it did show some improvement). Not sure whether this really addresses your problem – The Great Feb 24 '22 at 06:58
  • Thank you for the welcome! Just tried that and it has worked, but that seems quite strange - do you have any idea why that is the case? – jmero Feb 24 '22 at 07:08
  • I just asked this question a few days ago. You can refer to this post - https://stats.stackexchange.com/questions/563545/why-metrics-focus-on-maximizing-only-majority-class-in-binary-classification – The Great Feb 24 '22 at 07:40
  • Does this answer your question? [Why metrics focus on maximizing only majority class in binary classification?](https://stats.stackexchange.com/questions/563545/why-metrics-focus-on-maximizing-only-majority-class-in-binary-classification) – Karolis Koncevičius Feb 26 '22 at 07:42
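
A small, self-contained illustration of the label-flipping suggestion from the comments (assuming scikit-learn's `roc_auc_score`; the scores below are synthetic and chosen so that class 0 is ranked higher). Flipping the labels mirrors the AUC around 0.5, which is the same as keeping the labels and scoring `1 - p`:

```python
# Sketch with synthetic scores; assumes numpy and scikit-learn.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=200)
# Scores that deliberately rank class 0 higher than class 1.
p = np.clip(0.7 - 0.4 * y + rng.normal(0, 0.2, size=200), 0, 1)

print(roc_auc_score(y, p))      # well below 0.5: "predicts the opposite"
print(roc_auc_score(1 - y, p))  # labels flipped -> AUC mirrored to 1 - AUC
print(roc_auc_score(y, 1 - p))  # equivalent fix: score 1 - p with original labels
```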
