
Pruning is a technique in machine learning and search algorithms that reduces the size of decision trees by removing sections of the tree that provide little power to classify instances.

I know what decision trees are and how they work. I am having trouble understanding how the pruning technique works and how it is implemented.

Could anyone explain in simple words (and maybe with an example) how the pruning technique works and how it is implemented in decision trees?

Pluviophile

1 Answer


There are two main ways of pruning decision trees: pre-pruning and post-pruning. With pre-pruning, there are again basically two ways of doing it:

  1. Instead of growing the tree until it fits the given data perfectly, you stop splitting a node into further nodes once the number of samples in it is smaller than a certain threshold.
  2. When the number of samples in a node is smaller than the threshold, you stop splitting the node and classify it based on the samples of its parent node.
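The two pre-pruning rules above can be sketched in a few lines of plain Python. This is only a minimal illustration, not how any particular library does it: the Gini-based split search, the dict-based tree layout, and the names `build`, `min_samples` and `use_parent` are all assumptions made for this example.

```python
from collections import Counter

def majority(labels):
    # Most common label among the samples (ties resolved by Counter.most_common).
    return Counter(labels).most_common(1)[0][0]

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y):
    # Return the (feature, threshold) pair with the lowest weighted Gini
    # impurity, or None if no split improves on the unsplit node.
    best, best_score = None, gini(y)
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build(X, y, min_samples, use_parent=False, parent_label=None):
    if len(y) < min_samples:
        # Pre-pruning: too few samples in this node, so stop splitting.
        if use_parent and parent_label is not None:
            return {"leaf": parent_label}  # rule 2: label from the parent's samples
        return {"leaf": majority(y)}       # rule 1: label from the node's own samples
    split = best_split(X, y)
    if split is None:                      # pure node or no useful split left
        return {"leaf": majority(y)}
    f, t = split
    here = majority(y)                     # passed down for rule 2
    li = [i for i, row in enumerate(X) if row[f] <= t]
    ri = [i for i, row in enumerate(X) if row[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build([X[i] for i in li], [y[i] for i in li],
                      min_samples, use_parent, here),
        "right": build([X[i] for i in ri], [y[i] for i in ri],
                       min_samples, use_parent, here),
    }

def predict(node, row):
    while "leaf" not in node:
        go_left = row[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["leaf"]

def count_leaves(node):
    if "leaf" in node:
        return 1
    return count_leaves(node["left"]) + count_leaves(node["right"])

# Toy data: one feature, with a single "a" sitting among the "b" samples.
X = [[1], [2], [3], [4], [5], [6], [7], [8]]
y = ["a", "a", "a", "b", "a", "b", "b", "b"]

full = build(X, y, min_samples=1)                            # threshold never triggers
pruned = build(X, y, min_samples=4)                          # rule 1
pruned_parent = build(X, y, min_samples=4, use_parent=True)  # rule 2
print(count_leaves(full), count_leaves(pruned), count_leaves(pruned_parent))
```

On this toy data the unpruned tree fits every training sample, while both pre-pruned trees come out smaller and accept one training error instead of carving out a leaf for the single outlier.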

The second pre-pruning method makes sure that every classification is based on at least a certain number of samples (the threshold). With post-pruning, you first grow the full tree and then recursively try to cut subtrees from it, starting from the leaves, and you keep the pruning that performs best empirically on a validation set.
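One common concrete form of post-pruning is reduced-error pruning, which is what the sketch below uses. Treat it as an illustration under assumptions: the tree is hand-built as nested dicts, each internal node carries a hypothetical `majority` field holding the most common training label at that node, and the validation data is made up.

```python
def predict(node, row):
    while "leaf" not in node:
        go_left = row[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["leaf"]

def errors(node, X, y):
    # Number of misclassified samples for the (sub)tree rooted at node.
    return sum(predict(node, row) != lab for row, lab in zip(X, y))

def count_leaves(node):
    if "leaf" in node:
        return 1
    return count_leaves(node["left"]) + count_leaves(node["right"])

def prune(node, X_val, y_val):
    # Reduced-error pruning, bottom-up: prune the children first, then
    # replace this node by a leaf if that is no worse on the validation
    # samples that reach it. Returns a new tree; the input is not modified.
    if "leaf" in node:
        return node
    f, t = node["feature"], node["threshold"]
    left = [(r, l) for r, l in zip(X_val, y_val) if r[f] <= t]
    right = [(r, l) for r, l in zip(X_val, y_val) if r[f] > t]
    node = dict(
        node,
        left=prune(node["left"], [r for r, _ in left], [l for _, l in left]),
        right=prune(node["right"], [r for r, _ in right], [l for _, l in right]),
    )
    candidate = {"leaf": node["majority"]}
    # Ties favour the smaller tree (this includes the case where no
    # validation sample reaches the node at all).
    if errors(candidate, X_val, y_val) <= errors(node, X_val, y_val):
        return candidate
    return node

# A fully grown tree that has carved out a leaf for one outlier at x = 5.
tree = {
    "feature": 0, "threshold": 3, "majority": "a",
    "left": {"leaf": "a"},
    "right": {
        "feature": 0, "threshold": 5, "majority": "b",
        "left": {
            "feature": 0, "threshold": 4, "majority": "b",
            "left": {"leaf": "b"},
            "right": {"leaf": "a"},
        },
        "right": {"leaf": "b"},
    },
}

# Held-out validation data (made up for this example).
X_val = [[2], [3.5], [4.5], [5.5], [7]]
y_val = ["a", "b", "b", "b", "b"]

pruned = prune(tree, X_val, y_val)
print(count_leaves(tree), errors(tree, X_val, y_val))
print(count_leaves(pruned), errors(pruned, X_val, y_val))
```

The root split survives because collapsing it would increase the validation error, while the subtree that only existed to fit the outlier is cut back to a single leaf.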

Pluviophile
R.Gad