Highest Voted 'kaggle' Questions - Statistical Analysis Stack Exchange

59

votes

7 answers

Industry vs Kaggle challenges. Is collecting more observations and having access to more variables more important than fancy modelling?

I hope the title is self-explanatory. On Kaggle, most winners use stacking, sometimes with hundreds of base models, to squeeze out a few extra percent of MSE, accuracy, and so on. In general, in your experience, how important is fancy modelling such as stacking vs…

asked Jul 10 '18 at 12:42

Tom

1,204
8
17

13

votes

2 answers

Are Kaggle competitions just won by chance?

Kaggle competitions determine final rankings based on a held-out test set. A held-out test set is a sample; it may not be representative of the population being modeled. Since each submission is like a hypothesis, the algorithm that won the…

machine-learning probability hypothesis-testing sample kaggle

asked Jul 19 '17 at 20:59

sjw

5,091
1
21
45
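
The concern in the excerpt above, that the top score on a finite held-out test set can partly reflect sampling noise, can be illustrated with a small simulation. This is only a sketch under strong assumptions that are not in the question: every submission has the same true accuracy, errors are independent across submissions, and the names `simulate_leaderboard`, `n_submissions`, `n_test`, and `true_accuracy` are made up for illustration.

```python
import random


def simulate_leaderboard(n_submissions=200, n_test=500, true_accuracy=0.8, seed=0):
    """Score equally skilled submissions on a finite held-out test set."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_submissions):
        # Every test example is answered correctly with the same probability
        # for every submission, so any difference in score is sampling noise.
        correct = sum(rng.random() < true_accuracy for _ in range(n_test))
        scores.append(correct / n_test)
    return scores


scores = simulate_leaderboard()
best = max(scores)
mean = sum(scores) / len(scores)
print(f"mean observed accuracy: {mean:.3f}")
print(f"best observed accuracy: {best:.3f}")
```

Under these assumptions the best observed score sits above the shared true accuracy even though no submission is actually better than the others. Real leaderboards differ because all submissions are scored on the same examples and their errors are correlated.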

3

votes

2 answers

How to make train/test split with given class weights

I am working on a simple multi-class classification problem. I was given training data with perfectly balanced classes. However, the data I must predict on is not balanced. I was able to deduce the class proportions of the test data. Is there a way to split…

python machine-learning kaggle

asked Aug 30 '19 at 09:01

Dmitry Petrov

31
3
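
One possible way to approach the split described in the question above is to draw the validation rows class by class, so that the validation set follows the deduced test proportions, and keep the remaining rows for training. This is a minimal sketch, not an answer taken from the thread: the helper `sample_with_class_proportions`, the toy labels, and the target proportions are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def sample_with_class_proportions(labels, proportions, size, seed=0):
    """Return row indices of a sample whose class mix follows `proportions`."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    chosen = []
    for label, share in proportions.items():
        n_wanted = round(size * share)
        pool = by_class[label]
        if n_wanted > len(pool):
            raise ValueError(f"not enough rows of class {label!r}")
        # Sample without replacement inside each class.
        chosen.extend(rng.sample(pool, n_wanted))
    rng.shuffle(chosen)
    return chosen


labels = ["a", "b", "c"] * 100            # balanced toy data: 100 rows per class
target = {"a": 0.6, "b": 0.3, "c": 0.1}   # assumed test-set class proportions
valid_idx = sample_with_class_proportions(labels, target, size=100)
valid_set = set(valid_idx)
train_idx = [i for i in range(len(labels)) if i not in valid_set]
valid_counts = Counter(labels[i] for i in valid_idx)
print(valid_counts)
```

The training rows left over are no longer balanced, so this choice trades a representative validation set for a skewed training set. Reweighting the classes in the loss is an alternative that keeps all rows.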

1

vote

1 answer

What are the best practices for selecting your cross validation strategy?

I am new to Kaggle competitions and want to know if there are best practices for selecting a robust CV strategy.

machine-learning neural-networks kaggle

asked Jan 07 '20 at 11:36

Kurtis Pykes

135
5

1

vote

1 answer

Cross validation best practice for competition purpose

I'm fairly new to the data science scene and have been learning the theory and practicing on Kaggle and in private competitions. For real-world problems, my understanding is that you split out a test set from what you have, use the training set for…

cross-validation boosting kaggle

asked Jul 01 '19 at 16:21

bchoiNY

13
3

1

vote

0 answers

What do they mean by Robust Cross-Validation?

I was reading a Kaggler Interview article and they kept specifying the importance of a stable and good cross-validation in order to win their competitions. What do they mean by that? I usually just use cross_val_score, and that's enough for me.

cross-validation robust kaggle

asked Dec 20 '18 at 20:27

Chipmunkafy

115
1
4
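
One common reading of a "stable" cross-validation, offered here as an interpretation rather than something the question or the interview states, is that the score estimate changes little when the folds are reshuffled. The sketch below repeats a shuffled k-fold split several times and reports the spread of the fold scores. It uses only the standard library; the helpers `repeated_kfold` and `majority_class_accuracy` and the toy labels are hypothetical.

```python
import random
import statistics


def repeated_kfold(n_samples, n_splits=5, n_repeats=3, seed=0):
    """Yield (train_indices, test_indices) for repeated, shuffled k-fold CV."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)
        for fold in range(n_splits):
            test = order[fold::n_splits]  # every n_splits-th shuffled index
            held_out = set(test)
            train = [i for i in order if i not in held_out]
            yield train, test


def majority_class_accuracy(y, train, test):
    """Toy model: always predict the most common training label."""
    majority = statistics.mode([y[i] for i in train])
    return sum(y[i] == majority for i in test) / len(test)


label_rng = random.Random(1)
y = [int(label_rng.random() < 0.7) for _ in range(200)]  # toy binary labels

fold_scores = [
    majority_class_accuracy(y, train, test)
    for train, test in repeated_kfold(len(y))
]
print(f"folds: {len(fold_scores)}")
print(f"mean accuracy: {statistics.mean(fold_scores):.3f}")
print(f"std across folds: {statistics.stdev(fold_scores):.3f}")
```

A large standard deviation relative to the gap between two candidate models suggests that the comparison is within the noise of the validation scheme.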

0

votes

1 answer

SVC doing great on validation & test data but scored very low on predicted data

First of all, this is my first machine learning project after taking Andrew Ng's course, so please bear with me. I'm working on the most famous dataset, the Titanic data. First, I split the dataset into training and testing sets: training, testing =…

machine-learning cross-validation svm scoring-rules kaggle

asked Oct 12 '17 at 03:41

Blaze Tama

115
1
8
