I am reading "An Introduction to Statistical Learning with Applications in R" by James et al. On page 326, cross-validation is used to determine the optimal level of tree complexity for a classification tree. In the output of the R code provided with the book, one of the values of $k$ (which is the tuning parameter $\alpha$ for cost-complexity pruning) equals $-\infty$. How can the cost-complexity parameter take the value $-\infty$?

Thanks in advance.

Zachary
  • What library was used to fit this model? What does the documentation about that library say about this method? – Sycorax Mar 09 '20 at 19:03
  • @SycoraxsaysReinstateMonica: It's the [`tree`](https://cran.r-project.org/web/packages/tree/) package and unfortunately the documentation does not cover this point at all. – usεr11852 Mar 09 '20 at 19:17

1 Answer

The value `-Inf` does not really come from `cv.tree` itself but from the functionality around `prune.tree`/`prune.misclass`. Those functions examine a sequence of complexity parameters $k$. The sequence starts at an extremely large negative value, so that nothing is penalised and the first entry corresponds to the largest possible tree. Internally that starting value is a finite number, something like -1.0e+200. Then, in a rather ad hoc way, `prune.tree` overwrites the first member of the sequence of $k$'s with minus infinity (literally: `k[1L] <- -Inf`).
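To see this without going through cross-validation, here is a minimal sketch that calls `prune.misclass` directly. It assumes the `tree` package is installed and uses the built-in `iris` data purely for illustration (the book's example uses a different data set); the object names `fit` and `pruned` are my own.

```r
# Assumes the CRAN package 'tree' is installed: install.packages("tree")
library(tree)

# Fit a classification tree on the built-in iris data (illustrative choice,
# not the data set used in the book).
fit <- tree(Species ~ ., data = iris)

# With neither k nor best supplied, prune.misclass() returns the whole
# cost-complexity sequence rather than a single pruned tree.
pruned <- prune.misclass(fit)

# Candidate values of k next to the number of terminal nodes of each subtree.
print(data.frame(size = pruned$size, k = pruned$k))

# The first entry corresponds to the unpruned tree; its k is reported as -Inf.
is.infinite(pruned$k[1]) && pruned$k[1] < 0
```

The same `k` vector is what `cv.tree(fit, FUN = prune.misclass)` reports, so the `-Inf` there should be read as "no penalty at all, keep the full tree" rather than as a value of $\alpha$ that was actually optimised over.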

usεr11852