
I'm using the $k$-fold cross validation technique to generate train, test, and validation indexes for a neural network. My sample size is between 230 and 700. What is the best $k$ for cross validation here? Currently I'm using 10-fold cross validation, but I think it is too high. What is your idea?

Sycorax
user2991243
  • 1
    Have you tried K-K-fold cross validation to determine the best K? – AdamO Sep 02 '14 at 16:58
  • No. I used it to get a more reliable model (accuracy, sensitivity, and specificity) for my classification problem. Also, this is the cost function of an optimization algorithm and I need a more reliable average cost. – user2991243 Sep 02 '14 at 17:00
  • 1
    I'm just kidding. It should just be enough to have confidence there's no uncertainty due to subsample choice. Traditional train-test validation is $k=1$, remember. $k$-fold "kicks in pretty quickly" as far as the $k$ is concerned, in my opinion. Double $k$-fold is not totally uncalled for if you HAVE to know, just do iterative split sample validation in your other $k$ to see how variable those model performance statistics are, but beware of small sample bias issues if you are getting very small $n$s there. – AdamO Sep 02 '14 at 17:09
  • Oh, I didn't get that :-D So what is your opinion for this sample size? Do you think 10-fold is good? Also, I'm using a neural network, and when I look at the main page in MATLAB, in all the neural network designs it is validation (maximum = 6) that stops the training. – user2991243 Sep 02 '14 at 17:11
  • Are you sure that's training? I think most software tends to report iterations of the backpropagation, not validation; 6 seems like the case for that. I haven't used MATLAB. 10-fold is almost always fine regardless of sample size. If sample size *is* an issue, then you should be validating with a bootstrap instead! – AdamO Sep 02 '14 at 17:19
  • Yes. The default early stopping of the neural network toolbox is set to a maximum of 6 iterations. Thank you. – user2991243 Sep 02 '14 at 17:38

1 Answer


Actually, there is no single right answer to the choice of $K$ in $k$-fold cross validation. A higher $k$ gives you more, but smaller, subsets on which to run testing. A common choice is to select the $K$ that gives you a test set of about 15% of your total dataset, i.e. $k \approx 7$.
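As a sketch of that rule of thumb (using scikit-learn rather than the MATLAB toolbox mentioned in the question, and a hypothetical sample size from the asker's stated range):

```python
import numpy as np
from sklearn.model_selection import KFold

n = 500  # hypothetical sample size within the 230-700 range from the question
X = np.arange(n)

# A ~15% test fold corresponds to k = round(1 / 0.15) = 7
k = round(1 / 0.15)

kf = KFold(n_splits=k, shuffle=True, random_state=0)
for fold, (train_idx, test_idx) in enumerate(kf.split(X)):
    print(f"fold {fold}: train={len(train_idx)}, test={len(test_idx)}")
```

Each of the 7 folds then holds out roughly 71-72 of the 500 samples, close to the 15% target.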

However, other methods are also available; you may want to consider permutation or exhaustive cross validation methods (more info here).
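To illustrate the difference between those two families (again a scikit-learn sketch, not part of the original answer): exhaustive cross validation tests every possible held-out subset, which explodes combinatorially, while a permutation-style approach draws repeated random splits.

```python
import numpy as np
from sklearn.model_selection import LeavePOut, ShuffleSplit

X = np.arange(10)  # tiny toy set: exhaustive CV grows combinatorially with n

# Exhaustive: every size-2 subset is used as the test set exactly once
lpo = LeavePOut(p=2)
print(lpo.get_n_splits(X))  # C(10, 2) = 45 splits

# Randomized alternative: a fixed number of independent 85/15 splits
ss = ShuffleSplit(n_splits=20, test_size=0.15, random_state=0)
print(ss.get_n_splits())  # 20 splits, regardless of n
```

For the asker's sample sizes (230-700), fully exhaustive schemes are impractical, which is why $k$-fold or repeated random splits are the usual choices.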

Hope it helps.