From the ?rpart documentation:
na.action : the default action deletes all observations for which y is missing, but keeps those in which one or more predictors are missing.
How does it impute missing values in predictors?
rpart does not impute the missing values as such; this is where the surrogate variables come in. For each split, observations in which the split variable is missing are sent left or right using the best surrogate variable. If that is missing too, the next best surrogate is used, and so on.
The details are in a document that is accessible through the rpart help (pdf).
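A minimal sketch of this behaviour, assuming the built-in airquality data set (my choice for illustration, it is not taken from the rpart documentation). Ozone and Solar.R contain NAs while the response Temp does not, so all rows should be kept. The control values shown are, as far as I know, the rpart.control defaults, written out only to make the surrogate settings visible.

```r
library(rpart)

# Illustrative data only: Ozone and Solar.R have missing values,
# the response Temp and the predictor Wind are complete.
data(airquality)
colSums(is.na(airquality))

# maxsurrogate: number of surrogate splits retained per node.
# usesurrogate = 2: use surrogates in order; if all are missing,
# send the observation in the majority direction.
fit <- rpart(Temp ~ Ozone + Solar.R + Wind,
             data = airquality,
             control = rpart.control(maxsurrogate = 5, usesurrogate = 2))

# The summary lists the surrogate splits under each primary split.
summary(fit)

# Rows with missing predictors still get a prediction.
pred <- predict(fit, newdata = airquality)
```

Setting usesurrogate = 0 in rpart.control would instead only display the surrogates, and observations missing the primary split variable would not be sent further down the tree.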