
I often find myself training several different predictive models using caret in R. I'll train them all on the same cross-validation folds, using caret::createFolds, then choose the best model based on cross-validated error.

However, the median prediction from several models often outperforms the best single model on an independent test set. I'm thinking of writing some functions for stacking/ensembling caret models that were trained with the same cross-validation folds, for example by taking median predictions from each model on each fold, or by training a "meta-model."

Of course, this might require an outer cross-validation loop. Does anyone know of any existing packages/open source code for ensembling caret models (and possibly cross-validating those ensembles)?
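To make the setup concrete, here is a minimal sketch of training two caret models on identical folds and combining their predictions by a row-wise median. This assumes the caret package is installed; the model methods ("lm", "rpart") and the use of iris are purely illustrative:

```r
library(caret)

# Illustrative regression task: predict Sepal.Length from the
# other numeric iris columns
x <- iris[, 2:4]
y <- iris$Sepal.Length

# Build one set of folds and pass it to both models via
# trainControl(index = ...), so their out-of-fold predictions
# are directly comparable
set.seed(42)
folds <- createFolds(y, k = 5, returnTrain = TRUE)
ctrl  <- trainControl(method = "cv", index = folds,
                      savePredictions = "final")

m1 <- train(x, y, method = "lm",    trControl = ctrl)
m2 <- train(x, y, method = "rpart", trControl = ctrl)

# Simple ensemble: row-wise median of the two models' predictions
preds    <- cbind(predict(m1, x), predict(m2, x))
ensemble <- apply(preds, 1, median)
```

Note that for an honest estimate of the ensemble's error, the combining step itself would need to sit inside an outer cross-validation loop, as the question suggests.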

Zach

3 Answers


It looks like Max Kuhn actually started working on a package for ensembling caret models, but hasn't had time to finish it yet. This is exactly what I was looking for. I hope the project gets finished one day!

edit: I wrote my own package to do this: caretEnsemble
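For reference, a minimal sketch of the caretEnsemble workflow looks roughly like the following. This assumes the caretEnsemble package is installed; the method names are illustrative choices, not recommendations:

```r
library(caret)
library(caretEnsemble)

# savePredictions is needed so the out-of-fold predictions are
# available for the ensembling step
ctrl <- trainControl(method = "cv", number = 5,
                     savePredictions = "final")

# Train a list of models on shared resampling folds
models <- caretList(Sepal.Length ~ . - Species, data = iris,
                    trControl = ctrl,
                    methodList = c("lm", "rpart"))

# Blend the models linearly, or stack them with a meta-model
blend <- caretEnsemble(models)
stack <- caretStack(models, method = "glm")
```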

Zach

What you are looking for is called "model ensembling". A simple introductory tutorial with R code can be found here: http://viksalgorithms.blogspot.jp/2012/01/intro-to-ensemble-learning-in-r.html

thiakx
    Not to be nitpicky, but "ensembling" is right in the title of my post. I'm very specifically looking for an R package for ensembling arbitrary models, which doesn't seem to exist. Thanks for posting the code, though. Maybe I'll write my own package! – Zach Oct 15 '12 at 19:09

I'm not quite sure what you are looking for but this might help: http://www.jstatsoft.org/v28/i05/paper

It shows how to use multiple models in caret; the part you might be interested in is Section 5 on p. 13.

screechOwl
  • What I'm looking for is a package that would take as an input a list of caret objects, and would then output the median, mean, or weighted mean of their predictions. More advanced functionality might include optimizing the weights through nested cross-validation. – Zach Oct 15 '12 at 19:11
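The combining step described in that comment is straightforward in base R. The following is a hypothetical helper (the name combine_predictions is an invention for illustration) that takes a list of prediction vectors, one per model, and combines them by median, mean, or weighted mean:

```r
# Combine a list of per-model prediction vectors into a single
# ensemble prediction. pred_list: list of numeric vectors of
# equal length; weights: one non-negative weight per model,
# required only for type = "wmean".
combine_predictions <- function(pred_list,
                                type = c("median", "mean", "wmean"),
                                weights = NULL) {
  type <- match.arg(type)
  m <- do.call(cbind, pred_list)  # n observations x k models
  switch(type,
         median = apply(m, 1, median),
         mean   = rowMeans(m),
         wmean  = {
           stopifnot(length(weights) == ncol(m))
           # Normalize weights so they sum to 1
           as.vector(m %*% (weights / sum(weights)))
         })
}

preds <- list(c(1, 2), c(3, 4))
combine_predictions(preds, "mean")                      # c(2, 3)
combine_predictions(preds, "wmean", weights = c(1, 3))  # c(2.5, 3.5)
```

Choosing the weights themselves would still require nested cross-validation to avoid optimistic error estimates, as the comment notes.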