
In Vapnik's Statistical Learning Theory (1998 edition), on pages 89-92, he proves a "key theorem of learning theory" that states the conditions under which:

"the following two statements are equivalent:

  1. For the given distribution function F(z), the empirical risk minimization method is strictly consistent on the set of functions Q(z, a), a in A.
  2. For the given distribution function F(z), the uniform one-sided convergence of the means to their mathematical expectation takes place over the set of functions Q(z, a)."

On page 92 he says that the "probability on the right-hand side tends to zero by the law of large numbers" for one inequality, but then, for the second part of this half of the proof, he invokes uniform one-sided convergence. I'm confused about why the first inequality needs only the law of large numbers and not uniform one-sided convergence.
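For reference, here is how I understand the two modes of convergence being contrasted, written out in notation I am assuming (with $R(a) = \int Q(z, a)\, dF(z)$ the expected risk and $R_{\mathrm{emp}}(a)$ the empirical risk over $l$ samples; please correct me if this is not what Vapnik means):

```latex
% Pointwise convergence (fixed a = a*): the classical law of large
% numbers applies to the single random variable Q(z, a*), giving
\lim_{l \to \infty} P\bigl\{\, R(a^*) - R_{\mathrm{emp}}(a^*) > \varepsilon \,\bigr\} = 0
\quad \forall \varepsilon > 0.

% Uniform one-sided convergence: the supremum is taken over the whole
% set of functions before the probability is evaluated, which the LLN
% alone does not give:
\lim_{l \to \infty} P\Bigl\{\, \sup_{a \in \Lambda} \bigl( R(a) - R_{\mathrm{emp}}(a) \bigr) > \varepsilon \,\Bigr\} = 0
\quad \forall \varepsilon > 0.
```

My question is essentially: why does the first inequality in the proof fall under the first (pointwise) statement rather than requiring the second?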

Can anyone help me with this?
