This post is related to an earlier post I published here a few days ago, in which I was having trouble with prediction error rates: the classification tree I grew underperforms a naive intercept-only model, which uses no predictors and simply bets on the majority class of the zero-one-coded binary response variable (a sketch of that baseline calculation follows the output below).
I then cross-validated the tree, and the results below show that no amount of pruning helps: the cross-validated misclassification count is 270 for every tree size, so the singleton (root-only) tree is tied for the lowest prediction error. Is this the same problem I described in that earlier post, and how can I resolve it? Thanks a lot!
> set.seed(47306)
> cv.h2 <- cv.tree(tree.h2, FUN=prune.misclass)
> cv.h2
$size
[1] 26 9 6 4 1
$dev
[1] 270 270 270 270 270
$k
[1] -Inf 0.00 1.00 2.50 2.67
$method
[1] "misclass"
attr(,"class")
[1] "prune" "tree.sequence"
> min.error = which.min(cv.h2$dev)
> min.error
[1] 1
> table(usedta[class.train,]$h2)
1poorHlth 0goodHlth
270 1305
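For reference, here is a minimal sketch of the naive majority-class baseline I am comparing against (the helper names baseline.errors and baseline.rate are my own; it assumes the objects usedta, class.train, and the factor response h2 from my session above):

# Naive intercept-only benchmark: always predict the majority class.
train.h2 <- usedta[class.train, ]$h2
tab <- table(train.h2)                        # 1poorHlth = 270, 0goodHlth = 1305
baseline.errors <- sum(tab) - max(tab)        # misclassifies every minority case: 270
baseline.rate <- baseline.errors / sum(tab)   # 270 / 1575, roughly 0.17

If I read ?cv.tree correctly, $dev with FUN = prune.misclass is the cross-validated misclassification count, so a dev of 270 at every size means the tree never does better than this baseline on held-out data. I also realize that which.min(cv.h2$dev) returns 1 only because all five dev values are tied, and that 1 is an index into $size, so cv.h2$size[min.error] is 26 (the unpruned tree), not the one-node tree.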