
For classification, what theoretical results relate the cross-validation estimate of accuracy to the generalisation accuracy?

I am particularly asking about results in a PAC-like framework, where no assumption is made that your function class contains the "true" function.

I would love to know if there are theorems of the form: if your leave-one-out cross-validation error rate is $\theta$ on $N$ examples, then your generalisation error rate is less than $\theta+\varepsilon$ with probability $f(\theta, \varepsilon, N)$.
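For calibration, the analogous statement for a fresh holdout set is standard: if $\hat{\theta}$ is the error rate measured on $N$ held-out examples and $R$ is the generalisation error, Hoeffding's inequality gives

$$\Pr\big[R > \hat{\theta} + \varepsilon\big] \le e^{-2N\varepsilon^2},$$

i.e. $f(\theta, \varepsilon, N) = 1 - e^{-2N\varepsilon^2}$ in the notation above. What I am unsure about is the leave-one-out case, where the $N$ per-fold error indicators are not independent, so this argument does not apply directly.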

If so, what are the general proof techniques to obtain them? What is the theoretical framework?

If a fully general theorem is impossible, what extra conditions, if any, allow you to arrive at this type of conclusion?

luispedro

1 Answer


I don't know much about these kinds of proofs, but I think John Langford's thesis might be a good reference. Here's a relevant page: http://hunch.net/~jl/projects/prediction_bounds/prediction_bounds.html

And here is what is probably the relevant section of the thesis: http://hunch.net/~jl/projects/prediction_bounds/thesis/mathml/thesisse15.xml#x22-300004.1
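To add what I vaguely remember (please check the details): one standard extra condition is algorithmic stability. Bousquet and Elisseeff, "Stability and Generalization" (JMLR, 2002), show that if the learning algorithm has hypothesis stability $\beta$, i.e. replacing one training example changes the expected loss of the learned hypothesis by at most $\beta$, then the leave-one-out estimate $R_{\text{loo}}$ concentrates around the true risk $R$. Up to constants I may be misremembering, the bound has the form

$$\Pr\big[R \ge R_{\text{loo}} + \varepsilon\big] \le \frac{\frac{1}{2N} + 3\beta}{\varepsilon^2},$$

obtained by bounding $\mathbb{E}\big[(R - R_{\text{loo}})^2\big]$ and applying Chebyshev's inequality. Earlier results in the same spirit (leave-one-out bounds for $k$-nearest-neighbour rules) are due to Devroye and Wagner in the late 1970s.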

IanS