Random forests handle it rather well; see Would a Random Forest with multiple outputs be possible/practical? or scikit-learn's documentation. I suspect GBM or any tree-based method can be adapted in a similar fashion.
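For instance, scikit-learn's `RandomForestRegressor` accepts a two-dimensional target directly (each leaf then stores a vector of per-output means). A minimal sketch on synthetic data:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic data: two correlated continuous targets
rng = np.random.RandomState(0)
X = rng.uniform(size=(500, 4))
y = np.column_stack([
    X[:, 0] + X[:, 1] + rng.normal(scale=0.1, size=500),
    X[:, 0] - X[:, 2] + rng.normal(scale=0.1, size=500),
])

# A single forest fit on both outputs at once
forest = RandomForestRegressor(n_estimators=100, random_state=0)
forest.fit(X, y)
print(forest.predict(X[:3]).shape)  # (3, 2): one prediction per output
```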
More generally, when you run any learning algorithm, you usually minimize a one-dimensional score such as $\sum_i(p_i-y_i)^2$. But you can specify any target function. If you were working on (two-dimensional) position prediction, $\sum_i\left[(\hat{x}_i-x_i)^2+(\hat{y}_i-y_i)^2\right]$ would be a natural metric.
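To make the position example concrete, here is a minimal sketch of that metric (the name `position_loss` is mine, not a library function):

```python
import numpy as np

def position_loss(pred, true):
    """Sum over samples of (x_hat - x)^2 + (y_hat - y)^2.

    pred, true: arrays of shape (n_samples, 2) holding (x, y) pairs.
    """
    return np.sum((pred - true) ** 2)

# Two predicted positions vs. ground truth
pred = np.array([[1.0, 2.0], [3.0, 4.5]])
true = np.array([[1.0, 2.5], [2.0, 4.0]])
print(position_loss(pred, true))  # 0.25 + 1.0 + 0.25 = 1.5
```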
If you have mixed-type output (classification and regression), then the target function forces you to decide how much weight each target gets: which scaling do you apply to the continuous responses? Which loss do you apply to misclassifications? The sketch below makes those choices explicit.
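One way to do that is to write the combined objective down, for instance a weighted sum of squared error on the continuous output and log-loss on the class output. This is an illustrative sketch, not a standard recipe; the weight `alpha` is a hypothetical knob you would have to tune:

```python
import numpy as np

def mixed_loss(y_reg_pred, y_reg_true, p_clf_pred, y_clf_true, alpha=1.0):
    """Weighted sum of squared error (regression) and log-loss (classification).

    alpha trades off the two parts; since squared error depends on the
    scale of the continuous response, standardizing that response first
    (or tuning alpha) is usually necessary.
    """
    reg_term = np.mean((y_reg_pred - y_reg_true) ** 2)
    eps = 1e-12  # avoid log(0)
    p = np.clip(p_clf_pred, eps, 1 - eps)
    clf_term = -np.mean(y_clf_true * np.log(p)
                        + (1 - y_clf_true) * np.log(1 - p))
    return reg_term + alpha * clf_term
```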
As for further academic reading:

- SVM Structured Learning's Wikipedia page
- Simultaneously Leveraging Output and Task Structures for Multiple-Output Regression
- The Landmark Selection Method for Multiple Output Prediction (deals with high-dimensional dependent variables)