3

For any dataset $D$, with the label space containing m labels. Using decision tree, can we calculate the maximum training error on that?

Yes I have to admit, it's a homework question and I have completely no clues on that. Could any one gives me some hint on that? And what is label, does it mean edges of the tree?

xxx222
  • 397
  • 5
  • 14
  • Hi Xupeng, welcome to Cross-validated. It is a good question. However, if it is a homework question, please add the tag "homework". – Simone Oct 01 '15 at 04:42
  • Also you can add latex notations if you use the \$ symbol. E.g. for any dataset \$D\$.. – Simone Oct 01 '15 at 04:43

1 Answers1

1

A decision tree is a classification model. You can train a decision tree on a training set $D$ in order to predict the labels of records in a test set. $m$ is the possible number of labels. E.g. $m = 2$ you have a binary class problem, for example classifying patients who might either have or not have a disease; $m > 2$ you have multi-class problem, for example if you had to classify news in "politics", "sport", and "culture".

Think about how decision trees work. I wrote an explanation here. The hint I can give to you is that in your homework attributes are not relevant to the class. Also, think about the way decision trees classify records on leaf nodes.

Simone
  • 6,513
  • 2
  • 26
  • 52
  • So does it mean the dataset that may meet the maximum training error here has to be the worst one (one with the highest entropy) ? – xxx222 Oct 01 '15 at 05:26
  • Yes, I think you have to think about the worst dataset where your decision tree is just a root node. – Simone Oct 01 '15 at 07:26
  • Do you mean considering one root node situation is good enough? I am confused because of I know nothing about the information of node, which centers in the decision tree theory. – xxx222 Oct 01 '15 at 07:36
  • I mean, what is the worst case for a decision tree? that attributes are totally unpredictive to the class. In this case you wouldn't even grow a decision tree. You would just have a single root node, which can also be seen as a single leaf node. Now you have just to think about how you classify records on a leaf node: majority voting. – Simone Oct 01 '15 at 07:39
  • Does it mean in the worst case we can't even select a proper feature to grow the tree? So what should I do now? Randomly allocate m classifications? What does the majority means here? – xxx222 Oct 01 '15 at 07:56
  • Yes, in the worst case you cannot even grow a tree. You don't randomly label the records, you label them with just single label, the most frequent label: this is majority class. – Simone Oct 01 '15 at 08:19
  • I think it would be (1-1/k)? – rohit_r Dec 12 '21 at 16:10
  • What is $k$? the solution here is $1 - \frac{\text{ number of records from majority class}}{\text{total number of records}}$ – Simone Dec 16 '21 at 18:56