I'm learning about ensemble methods (a.k.a. bootstrapping?). Specifically, I'm using the statisticalModeling package, but I'm not sure that matters. Here is some code:
library(statisticalModeling)
lm_mtcars <- lm(
mpg ~ cyl + hp,
data = mtcars
)
If I look at summary(lm_mtcars)
I see a residual standard error of 3.17.
I learned about the ensemble() function of the statisticalModeling
package, which generates nreps new models based on nreps bootstrap samples:
ensemble_lm_mtcars <- statisticalModeling::ensemble(lm_mtcars, nreps = 100, data = mtcars)
This ensemble_lm_mtcars object appears to be made up of 5 parts; see the screenshot of my console:
I understand what these are; I checked by typing each one into the console and hitting enter. Presumably the "core" of this object is the replications.
I'm confused because I don't know what to do with this object now that I have created it. Presumably I can use it to try to improve my model accuracy, but how?
I Googled "Why use bootstrapping?" and the results page gave me this Wikipedia excerpt:
Bootstrapping allows assigning measures of accuracy (defined in terms of bias, variance, confidence intervals, prediction error or some other such measure) to sample estimates. This technique allows estimation of the sampling distribution of almost any statistic using random sampling methods.
OK... how? For example, how can I use ensemble_lm_mtcars to improve prediction error?
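For context, here is a minimal base R sketch of what I take the excerpt to mean by "assigning measures of accuracy to sample estimates". It does not use statisticalModeling at all, and the names n_reps and boot_coefs are ones I made up for illustration. I'm assuming the bootstrap boils down to refitting the same formula on rows of mtcars resampled with replacement. As far as I can tell this measures how uncertain the coefficients are rather than improving prediction error, which is the part I'm stuck on.

```r
# Minimal base R bootstrap sketch (assumption: this mirrors the idea behind
# ensemble(), but it is not that function's actual implementation).
set.seed(1)      # arbitrary seed, only so the run is reproducible
n_reps <- 100    # hypothetical name; same count as nreps = 100 above

# Refit the same formula on n_reps resamples of the rows of mtcars,
# drawn with replacement, and keep the coefficients from each fit.
boot_coefs <- t(replicate(n_reps, {
  rows <- sample(nrow(mtcars), replace = TRUE)
  coef(lm(mpg ~ cyl + hp, data = mtcars[rows, ]))
}))

# Spread of each coefficient across resamples: a bootstrap standard error.
boot_se <- apply(boot_coefs, 2, sd)
print(boot_se)

# Percentile interval for the hp coefficient.
hp_interval <- quantile(boot_coefs[, "hp"], probs = c(0.025, 0.975))
print(hp_interval)
```

Each row of boot_coefs holds the coefficients from one resampled fit, so the column-wise standard deviations can be compared with the standard errors that summary(lm_mtcars) reports.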