Once a complex classification tree has been built with rpart (in R), is there a way to organize the decision rules it produces by class? That is, instead of getting one huge tree, can we get a set of rules for each of the classes?

(If so, how?)

Here is a simple code example to demonstrate on:

library(rpart)
fit <- rpart(Kyphosis ~ Age + Number + Start, data=kyphosis)

Thanks.

Tal Galili

2 Answers

This functionality (or something close to it) seems to be available in the rattle package, as described in the R Journal 1/2, 2009 (p. 50), although I have only checked it from the command line.

For your example, it yields the following output:

 Rule number: 3 [Kyphosis=present cover=19 (23%) prob=0.58]
   Start< 8.5

 Rule number: 23 [Kyphosis=present cover=7 (9%) prob=0.57]
   Start>=8.5
   Start< 14.5
   Age>=55
   Age< 111

 Rule number: 22 [Kyphosis=absent cover=14 (17%) prob=0.14]
   Start>=8.5
   Start< 14.5
   Age>=55
   Age>=111

 Rule number: 10 [Kyphosis=absent cover=12 (15%) prob=0.00]
   Start>=8.5
   Start< 14.5
   Age< 55

 Rule number: 4 [Kyphosis=absent cover=29 (36%) prob=0.00]
   Start>=8.5
   Start>=14.5

To get this output, I sourced the rattle/R/rpart.R file (from the source package) into my workspace, after removing the two calls to Rtxt() in the asRules.rpart() function (you can also replace them with print). Then I just typed:

> asRules(fit)
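
If sourcing a modified rattle file is inconvenient, a similar per-class listing can probably be sketched with rpart alone. The snippet below is only an illustration, not what asRules() does internally: it uses path.rpart() to recover the split conditions leading to each leaf and then groups them by the predicted class. The variable names (leaf_nodes, leaf_class, rules_by_class) are my own, and the rules come out as plain strings without the cover and probability figures shown above.

```r
library(rpart)

fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis)

# Leaves are the rows of fit$frame whose split variable is "<leaf>";
# the row names of fit$frame are the node numbers.
is_leaf    <- fit$frame$var == "<leaf>"
leaf_nodes <- as.numeric(rownames(fit$frame))[is_leaf]

# Predicted class of each leaf: yval indexes into the response levels.
leaf_class <- attr(fit, "ylevels")[fit$frame$yval[is_leaf]]

# path.rpart() returns, for each requested node, the splits from the root.
paths <- path.rpart(fit, nodes = leaf_nodes, print.it = FALSE)

# Drop the leading "root" entry and join the conditions into one rule string.
rules <- vapply(paths, function(p) paste(p[-1], collapse = " & "), character(1))

# One character vector of rules per class.
rules_by_class <- split(unname(rules), leaf_class)
print(rules_by_class)
```

Each element of rules_by_class is then the set of leaf rules for one class, e.g. rules_by_class$present.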
chl

The rpart.plot package, version 3.0 (July 2018), has a function rpart.rules that generates a set of rules for a tree. For example:

library(rpart.plot)
fit <- rpart(Kyphosis ~ Age + Number + Start, data=kyphosis)
rpart.rules(fit)

gives

Kyphosis
    0.00 when Start >=      15
    0.00 when Start is 9 to 15 & Age <  55
    0.14 when Start is 9 to 15 & Age >=       111
    0.57 when Start is 9 to 15 & Age is 55 to 111
    0.58 when Start <  9

For more examples see Chapter 4 of the rpart.plot vignette.
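
As a small variation on the call above, here is a sketch of two rpart.rules arguments that may help when reading rules class by class. It assumes rpart.plot 3.0 or later is installed; cover = TRUE should add the percentage of observations each rule covers, and style = "tall" should print each rule over several lines. Since the result is a data frame with one row per leaf, it can also be subset like any other data frame.

```r
library(rpart)
library(rpart.plot)

fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis)

# cover = TRUE appends the percentage of observations covered by each rule
rules <- rpart.rules(fit, cover = TRUE)
print(rules)

# style = "tall" prints each rule on multiple lines instead of one wide row
print(rpart.rules(fit, style = "tall"))
```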