
Besides SVM, which classification models can be trained on a dataset that contains only positive training examples? And which of these models are generally known to perform better in such cases?

UPDATE: I mean problems that are described by the following quoted sentences:

  • "One-class SVM is an unsupervised algorithm that learns a decision function for novelty detection: classifying new data as similar or different to the training set".
  • "But what if you only have data of one class and the goal is to test new data and found out whether it is alike or not like the training data?".
Tim
  • You mean binary classification problem? – Jack Shi Nov 03 '15 at 18:13
  • No, sorry, I mean single class. Should I correct the title? – PatternRecognition Nov 03 '15 at 18:55
  • I see two votes to close this question as unclear - the question is *not* unclear. – Tim Nov 03 '15 at 18:56
  • I have just edited the question. I hope it is now clearer. – PatternRecognition Nov 03 '15 at 19:07
  • @PatternRecognition this is still unclear. I suggest you provide an example dataset to describe what you mean by "dataset of only positive training examples". To me this means you have some training data whose response is a binary, but whose values are all 1s (i.e. positive). In which case any sensible model will predict 1 (or NA I guess) for an unseen sample. – Ben Nov 03 '15 at 19:19
  • @Ben single-class classification is an existing name of whole class of methods, see: https://en.wikipedia.org/wiki/One-class_classification – Tim Nov 03 '15 at 19:58
  • @Tim +1. That's the first time I've ever heard of one-class classification. Thanks for clearing that up for me. – Ben Nov 03 '15 at 20:10

2 Answers


There are plenty of ways to construct one-class classifiers. I wrote a number of simple algorithms in the context of authorship verification. There, only positive samples from one author X are given, and the task is to judge whether a given document was written by X or not. However, the approach can be adapted to fields other than authorship verification just by adjusting the features. Here are two of my papers:

Oren Halvani, Lukas Graner, Inna Vogel. Authorship Verification in the Absence of Explicit Features and Thresholds. In: Pasi G., Piwowarski B., Azzopardi L., Hanbury A. (eds) Advances in Information Retrieval. ECIR 2018. Lecture Notes in Computer Science, vol 10772. Springer, Cham. https://doi.org/10.1007/978-3-319-76941-7_34

O. Halvani and M. Steinebach, "An Efficient Intrinsic Authorship Verification Scheme Based on Ensemble Learning," 2014 Ninth International Conference on Availability, Reliability and Security, 2014, pp. 571-578, doi: 10.1109/ARES.2014.84.

Sycorax
NeuroMorphing

This is generally called One-Class Classification, Single-Class Classification, Outlier Detection or even Support Determination (i.e., what is the support of a distribution).

These generally attempt to solve the problem of low-density rejection (i.e., rejecting points that fall in areas where the training data has low probability).
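To make low-density rejection concrete, here is a minimal sketch in plain Python. It is only an illustration under my own assumptions, not a reference implementation: the density is estimated with a Gaussian kernel density estimate, the bandwidth (0.5) and the rejection fraction (5%) are arbitrary choices, the training data is synthetic, and the helper names (`gaussian_kde_log_density`, `fit_threshold`, `is_inlier`) are made up for this example.

```python
import math
import random


def gaussian_kde_log_density(x, train, bandwidth):
    """Log of a Gaussian kernel density estimate at point x (a tuple of floats)."""
    d = len(x)
    log_norm = -0.5 * d * math.log(2 * math.pi * bandwidth ** 2)
    # Log kernel value contributed by each training point.
    log_kernels = [
        log_norm - sum((a - b) ** 2 for a, b in zip(x, t)) / (2 * bandwidth ** 2)
        for t in train
    ]
    # Log-mean-exp trick, so that very small kernel values do not break the log.
    m = max(log_kernels)
    return m + math.log(sum(math.exp(v - m) for v in log_kernels) / len(log_kernels))


def fit_threshold(train, bandwidth, reject_fraction=0.05):
    """Log-density below which roughly `reject_fraction` of the training points fall.

    Each training point is scored with itself left out, so that a point
    does not vouch for its own density.
    """
    scores = sorted(
        gaussian_kde_log_density(x, train[:i] + train[i + 1:], bandwidth)
        for i, x in enumerate(train)
    )
    return scores[int(reject_fraction * len(scores))]


def is_inlier(x, train, bandwidth, threshold):
    """Accept x only if its estimated log-density reaches the threshold."""
    return gaussian_kde_log_density(x, train, bandwidth) >= threshold


# Synthetic "positive" class: 200 points from a 2-D standard normal (assumption).
random.seed(0)
train = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(200)]
bandwidth = 0.5  # arbitrary; in practice this needs tuning
threshold = fit_threshold(train, bandwidth)

print(is_inlier((0.0, 0.0), train, bandwidth, threshold))  # point in the bulk of the data
print(is_inlier((8.0, 8.0), train, bandwidth, threshold))  # point far from all training data
```

The threshold is the only place where a "negative" class enters at all: you decide up front what fraction of the positive data you are willing to reject, and everything in a lower-density region is treated as not belonging to the class. The choice of density estimator and bandwidth matters a lot in higher dimensions, which is one reason other one-class methods exist.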

See here for some theory; it also references various technical approaches and surveys, although this is an area of active research.

MotiNK