Questions tagged [stacking]

Stacking is a meta-ensemble machine learning technique that trains a second-level model on the predictions of multiple base models that were themselves trained on the data.

Stacking works by:

  1. Training a variety of machine learning models on the dataset
  2. Generating predictions from each of the trained models
  3. Training a second-level machine learning model (a meta-learner) on the predictions from step #2

Stacking often produces more accurate results than simple voting or averaging of the ensemble members' predictions.
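As a concrete illustration of the three steps above, here is a minimal sketch in Python using scikit-learn's StackingRegressor (the dataset, base models, and meta-learner are arbitrary choices for illustration, not a recommendation):

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor, StackingRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

# Toy regression data (arbitrary, for illustration only)
X, y = make_regression(n_samples=1000, n_features=10, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Step 1: train a variety of base ("level 0") models
base_models = [
    ("rf", RandomForestRegressor(n_estimators=100, random_state=0)),
    ("svr", SVR(kernel="rbf")),
]

# Steps 2-3: StackingRegressor generates cross-validated predictions from
# the base models and fits the meta-learner (here, ridge regression) on them
stack = StackingRegressor(estimators=base_models, final_estimator=Ridge(), cv=5)
stack.fit(X_train, y_train)
print("Stacked R^2 on held-out data:", stack.score(X_test, y_test))
```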

89 questions
59 votes, 7 answers

Industry vs Kaggle challenges. Is collecting more observations and having access to more variables more important than fancy modelling?

I'd hope the title is self-explanatory. On Kaggle, most winners use stacking, sometimes with hundreds of base models, to squeeze out a few extra % of MSE or accuracy... In general, in your experience, how important is fancy modelling such as stacking vs…
Tom (1,204 rep)
36 votes, 2 answers

Is this the state of art regression methodology?

I've been following Kaggle competitions for a long time and I have come to realize that many winning strategies involve using at least one of the "big three": bagging, boosting, and stacking. For regression, rather than focusing on building one best…
16 votes, 2 answers

Ensemble Learning: Why is Model Stacking Effective?

Recently, I've become interested in model stacking as a form of ensemble learning. In particular, I've experimented a bit with some toy datasets for regression problems. I've basically implemented individual "level 0" regressors, stored each…
kylerthecreator (611 rep)
12 votes, 5 answers

Is automated machine learning a dream?

As I discover machine learning, I see different interesting techniques, such as: automatically tuning algorithms with techniques such as grid search, and getting more accurate results through the combination of different algorithms of the same "type", that's…
8 votes, 2 answers

Proper cross validation for stacking models

Let's assume that we have a dataset that contains a continuous variable $Y$ which we want to predict and 10 predictors $X_{1}, \ldots, X_{10}$. The number of observations is $n=1000$. I have questions about proper cross validation in the two following…
Tomek Tarczynski (3,854 rep)
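One standard answer to this kind of question is to build the meta-learner's inputs out-of-fold. A minimal sketch mimicking the setup above (continuous $Y$, 10 predictors, $n=1000$), with two arbitrary base models chosen purely for illustration:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold

# Synthetic stand-in for the question's data: continuous Y, 10 predictors, n = 1000
X, y = make_regression(n_samples=1000, n_features=10, noise=5.0, random_state=0)

base_models = [GradientBoostingRegressor(random_state=0), LinearRegression()]
meta_features = np.zeros((len(y), len(base_models)))

# Each observation's meta-feature is predicted by models that never saw it,
# so the meta-learner is not trained on leaked labels.
kf = KFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, valid_idx in kf.split(X):
    for j, model in enumerate(base_models):
        model.fit(X[train_idx], y[train_idx])
        meta_features[valid_idx, j] = model.predict(X[valid_idx])

# Second level: fit the meta-learner on the out-of-fold predictions
meta_learner = LinearRegression().fit(meta_features, y)
```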
7 votes, 1 answer

AIC model averaging when models are correlated

AIC model-averaging: In "standard" AIC model averaging we average models with weights proportional to $$w_i \propto \exp( -0.5 \times \Delta \text{AIC}_i ),$$ where $\Delta \text{AIC}_i$ is the difference of a model's AIC to the best (in terms of…
Björn (21,227 rep)
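For reference, the proportionality above is resolved by normalizing over the $M$ candidate models, giving the usual Akaike weights:

$$w_i = \frac{\exp(-\tfrac{1}{2}\,\Delta \text{AIC}_i)}{\sum_{j=1}^{M} \exp(-\tfrac{1}{2}\,\Delta \text{AIC}_j)}, \qquad \Delta \text{AIC}_i = \text{AIC}_i - \min_j \text{AIC}_j.$$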
7 votes, 2 answers

Stacking without splitting data

I learned about stacking as used in ensemble learning. In stacking, the training data is split into two sets. The first set is used for training each model (layer-1, left figure); the second one is used for training the combiner of predictions (layer-2, right…
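The two-set scheme the question describes is often called blending. A minimal sketch, with arbitrary models chosen for illustration:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=0)
# First set trains the layer-1 models; second set trains the combiner
X_l1, X_l2, y_l1, y_l2 = train_test_split(X, y, test_size=0.3, random_state=0)

layer1 = [RandomForestClassifier(random_state=0), LogisticRegression(max_iter=1000)]
for model in layer1:
    model.fit(X_l1, y_l1)

# Layer-2 inputs: each layer-1 model's predicted probabilities on the held-out set
meta_X = np.column_stack([m.predict_proba(X_l2)[:, 1] for m in layer1])
combiner = LogisticRegression().fit(meta_X, y_l2)
```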
7 votes, 1 answer

How to stack machine learning models in R

I am new to machine learning and R. I know that there is an R package called caretEnsemble, which can conveniently stack models in R. However, this package seems to have some problems when dealing with multi-class classification tasks.…
rz.He (331 rep)
6 votes, 3 answers

Model stacking gives poor performance

I'm trying model stacking in a Kaggle competition. However, what the competition is about is irrelevant here. I think my approach to model stacking is not correct. I have 4 different models: an xgboost model with dense features (numbers, that…
6 votes, 1 answer

Stacking: Do more base classifiers always improve accuracy?

When using stacking, can accuracy always be improved by adding more base classifiers, types of base classifiers, and features?
6 votes, 1 answer

Stacking and blending of regression models

I am self-studying blending and stacking, and am especially interested in this in the context of regression models. I have been reading a number of the stacking, blending and bagging links posted on this forum, but have failed to find or (more…
user1885116 (2,128 rep)
5 votes, 2 answers

How to properly do stacking/meta ensembling with cross validation

How do people use stacking or meta-ensembling with cross validation in practice and in machine learning competitions such as those on Kaggle? Here are two approaches I've seen (but maybe neither is correct?). Method 1 (probably introduces a leak) splits: A B…
Ben (1,612 rep)
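The leak-free variant that is usually recommended builds the second-level training set from out-of-fold predictions; scikit-learn's cross_val_predict does this in one call. A sketch under arbitrary model choices:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

X, y = make_classification(n_samples=1000, random_state=0)
base = RandomForestClassifier(random_state=0)

# Out-of-fold probabilities: each row is predicted by a model fit on the
# other folds, so the meta-learner never sees leaked training labels.
oof = cross_val_predict(base, X, y, cv=5, method="predict_proba")[:, 1]
meta = LogisticRegression().fit(oof.reshape(-1, 1), y)

# At test time the base model is refit on all of the training data
base.fit(X, y)
```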
5 votes, 1 answer

Isn't stacking models a direct approach to overfitting?

With help from the discussions here I successfully trained various models for classification. As an example, say I trained a stochastic gradient boosting model (gbm) and an extreme gradient boosted tree (xgboost). They are trained using…
Richi W (3,216 rep)
4 votes, 1 answer

Cross Validation in StackingClassifier Scikit-Learn

In the scikit-learn StackingClassifier documentation it is written: "Note that estimators_ are fitted on the full X while final_estimator_ is trained using cross-validated predictions of the base estimators using cross_val_predict." ... the default…
malioboro (851 rep)
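The behaviour the documentation describes can be reproduced directly; a minimal sketch (the models here are arbitrary, and cv=5 is also scikit-learn's default):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=500, random_state=0)

clf = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(random_state=0)),
                ("svc", LinearSVC(random_state=0))],
    final_estimator=LogisticRegression(),
    cv=5,  # folds used by cross_val_predict to build the meta-features
)
clf.fit(X, y)

# estimators_ are refit on the full X; final_estimator_ was trained only on
# the cross-validated predictions of the base estimators.
print(clf.final_estimator_.coef_)
```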
4 votes, 1 answer

Can we stack the strong learners?

Stacking is done by combining weak learners. What will happen if we do it with strong learners? Would that be a case of overfitting?