I couldn't find any information in the documentation of RapidMiner. I have a data set with the following attributes: a, b, c, d, e. The types are: numerical, binomial, binomial, binomial, binomial. Binomial values are given as {true, false}.
The last one is the label I want to predict, so the value to predict is a true/false decision. As I understand it, I have to use logistic regression for that.
My module chain looks like this.
Read CSV -> Set Role -> Nominal To Binary -> Classification by Regression
Set Role: I can only set one attribute as the predictor attribute. How can I set all the other input attributes as predictor attributes too?
Nominal to Binary: The binomial values are given as the strings "true" and "false". That's why I do the conversion here.
The output for MultiModelByRegression is:
MultiModelByRegression (prediction model for label is_helpful)

  16.889 * b ^ 1.000
+ 9.553 * c ^ 2.000
+ 0.102 * a ^ 1.000
- 71.438

  76.078 * b ^ 4.000
+ 38.618 * c ^ 1.000
+ 0.082 * a ^ 1.000
- 88.701
The Performance Vector output is:
              true false   true true   class precision
pred. false   2706         129         95.45%
pred. true    636          40          5.92%
class recall  80.97%       23.67%
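For reference, the percentages in the performance vector can be recomputed from its four counts. The following is a minimal Python sketch, not RapidMiner output; the dictionary layout and the function names are my own choices for illustration. Precision is computed per predicted row, recall per true column.

```python
# Counts copied from the performance vector; keys are (predicted, actual).
counts = {
    ("false", "false"): 2706,
    ("false", "true"): 129,
    ("true", "false"): 636,
    ("true", "true"): 40,
}
CLASSES = ("false", "true")


def precision(cls):
    """Share of examples predicted as cls whose true class is cls (row-wise)."""
    row_total = sum(counts[(cls, actual)] for actual in CLASSES)
    return counts[(cls, cls)] / row_total


def recall(cls):
    """Share of examples whose true class is cls that were predicted as cls (column-wise)."""
    col_total = sum(counts[(pred, cls)] for pred in CLASSES)
    return counts[(cls, cls)] / col_total


def accuracy():
    """Share of all examples on the diagonal of the table."""
    correct = sum(counts[(c, c)] for c in CLASSES)
    return correct / sum(counts.values())


for cls in CLASSES:
    print(f"{cls}: precision={precision(cls):.2%} recall={recall(cls):.2%}")
print(f"accuracy={accuracy():.2%}")
```

Read this way, the "true" class has both low precision and low recall, even though the "false" class looks good.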
I know how to interpret the above.
All of this is done with the training set, which is already labelled with the correct classes. How do I apply the test set at this point? I suppose I need the test set to somehow evaluate the results of my classifier, right? Anyway, I am really confused here and I would appreciate any kind of help.
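To make concrete what I mean by "applying the test set", here is a rough sketch of the generic hold-out idea in plain Python. It does not use RapidMiner operators, the rows are made up rather than taken from my data, and `split_holdout`, `fit_majority` and the 30% test fraction are hypothetical names and values chosen only for illustration. The stand-in "model" simply predicts the most frequent training label; the point is only that fitting sees the training part and scoring sees the held-out part.

```python
import random


def split_holdout(rows, test_fraction=0.3, seed=42):
    """Shuffle a copy of rows and split it into (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def fit_majority(train):
    """Stand-in 'model': the most frequent label in the training rows."""
    labels = [label for _, label in train]
    return max(set(labels), key=labels.count)


def accuracy(model_label, test):
    """Fraction of held-out rows whose label matches the model's prediction."""
    hits = sum(1 for _, label in test if label == model_label)
    return hits / len(test)


# Made-up rows of the form (features, label); NOT the data from the question.
rows = [((i,), i % 4 != 0) for i in range(100)]

train, test = split_holdout(rows)   # the model never sees `test` while fitting
model = fit_majority(train)         # fit on the training part only
print(len(train), len(test), accuracy(model, test))  # score on held-out part
```

Is this the right mental model, and if so, which part of my chain has to change so that the performance is measured on rows that were not used for training?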