Is there a way to use logistic regression to classify multi-labeled data? By multi-labeled, I mean data that can belong to multiple categories simultaneously.
I would like to use this approach to classify some biological data.
In principle, yes - although I'm not sure these techniques are still called logistic regression.
Actually, your question can refer to two independent extensions of the usual classifiers.

The first is dropping the closed-world assumption: you can require the sum of all memberships for each case to be one ("closed world", the usual case), or drop this constraint (the resulting models are sometimes called "one-class classifiers"). The latter could be trained as multiple independent LR models, one per class (see the sketch below), although one-class problems are often ill-posed (this class vs. all kinds of exceptions, which could lie in all directions), and then LR is not particularly well suited.
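Here is a minimal sketch of that independent-models approach in R; the data, the label names A/B/C, and the way the toy labels are generated are all made up for illustration:

```r
## Binary relevance: one independent logistic regression per label.
set.seed(42)
n <- 200
X <- data.frame(x1 = rnorm(n), x2 = rnorm(n))

## toy label matrix: one 0/1 column per class; a row may contain several 1s
Y <- with(X, cbind(A = as.integer(x1 + rnorm(n) > 0),
                   B = as.integer(x2 + rnorm(n) > 0),
                   C = as.integer(x1 + x2 + rnorm(n) > 0)))

## fit one binomial GLM per label, ignoring the other labels
models <- lapply(colnames(Y), function(lab)
  glm(Y[, lab] ~ x1 + x2, data = X, family = binomial))
names(models) <- colnames(Y)

## predicted membership for each class, independent of the others:
## rows need not sum to one (no closed-world constraint)
probs <- sapply(models, predict, newdata = X, type = "response")
head(round(probs, 2))
```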
The second extension is partial class memberships: each case belongs to each class with a membership in $[0, 1]$, so the full label is a vector in $[0, 1]^{n_{classes}}$, similar to memberships in fuzzy cluster analysis. Assume there are 3 classes A, B, C. Then a sample may be labelled as belonging to class B; this can also be written as the membership vector $[A = 0, B = 1, C = 0]$. In this notation, partial memberships would be e.g. $[A = 0.05, B = 0.95, C = 0]$ etc. Different interpretations can apply depending on the problem (fuzzy memberships or probabilities), and such partial memberships make sense not only for training but also for prediction (where e.g. posterior probabilities are not only possible but actually fairly common) and even for validation. The whole idea is that for borderline cases it may not be possible to assign them unambiguously to one class.
In R, e.g. `nnet::multinom` (the nnet package is distributed alongside MASS in the VR bundle) accepts such data for training. Behind the scenes, an ANN with logistic (softmax) output and no hidden layer is used; a sketch follows below.
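A minimal sketch of training on such partial memberships with `nnet::multinom` (the data here are synthetic; this relies on `multinom` interpreting a matrix response as per-class counts, so fractional memberships that sum to one per row act as case weights):

```r
## Multinomial logistic regression with partial class memberships.
library(nnet)

set.seed(7)
n <- 150
x1 <- rnorm(n)
x2 <- rnorm(n)

## synthetic soft labels for classes A, B, C (each row sums to 1);
## borderline cases get split memberships such as [0.05, 0.95, 0]
eta <- cbind(A = 0, B = 2 * x1, C = 2 * x2)
Y <- exp(eta) / rowSums(exp(eta))

fit <- multinom(Y ~ x1 + x2, trace = FALSE)

## predictions are again partial memberships (posterior probabilities)
head(round(predict(fit, newdata = data.frame(x1, x2), type = "probs"), 2))
```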
For the validation part, I developed the package softclassval.
One-class classifiers are nicely explained in Richard G. Brereton: Chemometrics for Pattern Recognition, Wiley, 2009.
We give a more detailed discussion of the partial memberships in this paper: Claudia Beleites, Kathrin Geiger, Matthias Kirsch, Stephan B Sobottka, Gabriele Schackert & Reiner Salzer: Raman spectroscopic grading of astrocytoma tissues: using soft reference information. Anal Bioanal Chem, 2011, Vol. 400(9), pp. 2801-2816
One straightforward way to do multi-label classification with a multi-class classifier (such as multinomial logistic regression) is to treat each possible combination of labels as its own class. For example, if you were doing binary multi-label classification with 3 labels, you could assign
[0 0 0] = 0
[0 0 1] = 1
[0 1 0] = 2
and so on, resulting in $2^3 = 8$ classes.
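Here is a minimal sketch of this encoding in R; `labels2class` and `class2labels` are hypothetical helper names:

```r
## Label powerset: encode each binary label vector as a single class,
## reading the label vector as a binary number, e.g. [0 1 0] -> 2.
labels2class <- function(Y)                 # Y: n x k binary label matrix
  as.vector(Y %*% 2^((ncol(Y) - 1):0))

class2labels <- function(cls, k)            # inverse mapping for decoding
  t(sapply(cls, function(c) rev(as.integer(intToBits(c))[1:k])))

Y <- rbind(c(0, 0, 0), c(0, 0, 1), c(0, 1, 0), c(1, 0, 1))
cls <- labels2class(Y)                      # 0 1 2 5
class2labels(cls, k = 3)                    # recovers the label matrix
## factor(cls) can now be fed to any multi-class classifier,
## e.g. nnet::multinom(factor(cls) ~ x1 + x2, data = X)
```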
The most obvious problem with this approach is that you can end up with a huge number of classes even with a relatively small number of labels (if you have $n$ labels you'll need $2^n$ classes). You also won't be able to predict label assignments that aren't present in your dataset, and you'll be making rather poor use of your data; but if you have a lot of data and good coverage of the possible label assignments, these things may not matter.
Moving beyond this and what was suggested by others, you'll probably want to look at structured prediction algorithms such as conditional random fields.
This problem is also related to cost-sensitive learning, where predicting a label for a sample incurs a cost. For a multi-label sample, the cost of predicting its true labels is low, while the cost of the other labels is higher.
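As an illustrative sketch of such a decision rule (the costs and probabilities below are made up): a label is assigned whenever the expected cost of omitting it exceeds the expected cost of assigning it.

```r
## Cost-sensitive per-label decision (illustrative costs):
## assign a label when the expected cost of omitting it (p * c_fn)
## exceeds the expected cost of assigning it ((1 - p) * c_fp).
decide <- function(p, c_fp = 1, c_fn = 5)
  p * c_fn > (1 - p) * c_fp   # equivalent to p > c_fp / (c_fp + c_fn)

p <- c(A = 0.30, B = 0.80, C = 0.10)  # per-label posterior probabilities
decide(p)                             # A and B are assigned, C is not
```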
You can take a look at this tutorial; you can also find the corresponding slides here.