I have a dataset that contains only a single class, namely '0'. There is no 'not 0' class.

The one-class SVM model was trained on a training set containing only the single class '0'. I would rather not have to find some arbitrary 'not 0' class just to include it in the test set for prediction.

What will the output of this approach be, and how can we interpret the result? What if the specificity is 0? Is it normal for sensitivity or specificity to be 0? In that case, how do we plot the ROC curve?

After training the model only on the '0' class, I tested it on only the '0' class (unseen and unlabeled data), and it still gave '-1' for a few samples. Why didn't it give '1' for all of them?

I would appreciate an example on arbitrary data.

Hello World

1 Answer

If your entire data only has the "0" class, then life is easy: just classify everything as "0". Any tool or method will do so, too. (If a method, upon seeing only "0" instances, classifies something as "rhubarb", I would question its sanity.)

If you classify everything as "0", and everything is in fact "0", then every instance is a true positive. There are no false positives, and no true or false negatives. Sensitivity, $\frac{TP}{TP+FN}$, is $\frac{n}{n}=1$; specificity, $\frac{TN}{TN+FP}$, is undefined: $\frac{0}{0}$.
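The arithmetic above can be sketched in a few lines of plain Python. This is only an illustration: the helper names `safe_ratio` and `sensitivity_specificity` are made up here, "0" is treated as the positive class, and an undefined ratio is reported as `None` rather than forced to a number.

```python
from typing import Optional, Sequence, Tuple


def safe_ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def sensitivity_specificity(
    y_true: Sequence[str], y_pred: Sequence[str], positive: str = "0"
) -> Tuple[Optional[float], Optional[float]]:
    """Compute (sensitivity, specificity), treating `positive` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    return safe_ratio(tp, tp + fn), safe_ratio(tn, tn + fp)


# Hypothetical data: ten instances, all truly "0", all predicted "0".
y_true = ["0"] * 10
y_pred = ["0"] * 10
sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec)  # 1.0 None
```

Specificity stays `None` no matter what the predictions are, because the data contain no negative instances to get right or wrong.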

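As to the ROC curve: there is no threshold to tune. There are no false positives, so the FPR is taken to be zero (strictly, with no negative instances it is $\frac{0}{0}$, just like specificity), the TPR is constant at one, and your ROC curve degenerates to a single point in the top left-hand corner. By the same convention, AUROC is 1.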

And to be honest, everything is maximally useless. If you already know everything is of just one class, why bother modeling?
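As for why a one-class SVM trained only on "0" can still return `-1` for some unseen "0" samples: one plausible reason is that the model is built to treat a fraction of its own training data as outliers. The sketch below assumes scikit-learn's `OneClassSVM` (the question does not say which implementation was used), where `nu` is an upper bound on the fraction of training errors. The Gaussian data and `nu=0.1` are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.svm import OneClassSVM

# Synthetic stand-in for the single "0" class: train and test points are
# drawn from the same 2-D Gaussian (an assumption, not the asker's data).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 2))
X_test = rng.normal(size=(200, 2))

# nu=0.1 asks for a boundary that leaves roughly up to 10% of the
# training points outside it.
model = OneClassSVM(kernel="rbf", gamma="scale", nu=0.1).fit(X_train)

# predict() returns +1 for points inside the learned boundary, -1 outside.
train_rejected = float(np.mean(model.predict(X_train) == -1))
test_rejected = float(np.mean(model.predict(X_test) == -1))

print(f"fraction of training points labelled -1: {train_rejected:.2f}")
print(f"fraction of unseen same-class points labelled -1: {test_rejected:.2f}")
```

Under these assumptions, a nonzero share of `-1` predictions on unseen same-class data is expected behaviour rather than a bug; lowering `nu` should shrink that share, at the cost of a looser boundary.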

Stephan Kolassa
  • So what is the correct way to make sense of this? I guess: train on the '0' class, and test on the '0' class plus a 'not 0' class? Can you present an example? – Hello World Jul 03 '19 at 13:53
  • Do you know whether there are anomaly/novelty/outlier detection methods that work directly on images rather than on image descriptors? – Hello World Jul 03 '19 at 13:56
  • By the way, after training the model only on the '0' class, I tested it on only the '0' class (unseen and unlabeled data), and it still gave '-1' for a few samples. Why didn't it give '1' for all of them? – Hello World Jul 03 '19 at 14:00