In Section 4, Empirical Risk Minimization, of the paper *Principles of Risk Minimization for Learning Theory* by V. Vapnik, the author says the following:
In order to solve this problem, the following induction principle is proposed: the risk functional $R(w)$ is replaced by the empirical risk functional
$$E(w) = \dfrac{1}{\mathscr{l}} \sum_{i = 1}^\mathscr{l} L(y_i, f(x_i, w)) \tag{3}$$ constructed on the basis of the training set (1). The induction principle of empirical risk minimization (ERM) assumes that the function $f(x, w^*_\mathscr{l})$, which minimizes $E(w)$ over the set $w \in W$, results in a risk $R(w^*_\mathscr{l})$ which is close to its minimum. This induction principle is quite general; many classical methods such as least squares or maximum likelihood are realizations of the ERM principle.
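For concreteness, here is a minimal sketch (my own illustration, not from the paper) of the ERM principle with squared loss and a linear model, in which case minimizing $E(w)$ is just ordinary least squares:

```python
# Minimal ERM sketch (illustrative; the data-generating process is assumed):
# squared loss L(y, f(x, w)) = (y - f(x, w))^2 with the linear model
# f(x, w) = w[0] + w[1] * x, so minimizing E(w) is ordinary least squares.
import numpy as np

rng = np.random.default_rng(0)
l = 50                                                # training-set size (the paper's script-l)
x = rng.uniform(-1.0, 1.0, size=l)
y = 2.0 * x + 1.0 + rng.normal(scale=0.3, size=l)     # assumed true function plus noise

def empirical_risk(w, x, y):
    """E(w) = (1/l) * sum of squared losses over the training set."""
    preds = w[0] + w[1] * x
    return np.mean((y - preds) ** 2)

# With squared loss and a linear model, the ERM solution w*_l has a
# closed form: the ordinary least-squares estimate.
X = np.column_stack([np.ones(l), x])
w_star, *_ = np.linalg.lstsq(X, y, rcond=None)

print("ERM solution w*_l:", w_star)
print("Empirical risk E(w*_l):", empirical_risk(w_star, x, y))
```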
The evaluation of the soundness of the ERM principle requires answers to the following two questions:
1. Is the principle consistent? (Does $R(w^*_\mathscr{l})$ converge to its minimum value on the set $w \in W$ when $\mathscr{l} \to \infty$?)
2. How fast is the convergence as $\mathscr{l}$ increases?
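To make questions 1 and 2 concrete, here is a small simulation (again my own sketch, under the same assumed linear-model, squared-loss setup as above) that approximates $R(w^*_\mathscr{l})$ on a large held-out sample for increasing $\mathscr{l}$; consistency says these values should approach the minimum achievable risk:

```python
# Sketch: approximate the true risk R(w*_l) of the ERM solution on a large
# held-out sample, for growing training-set sizes l (setup is assumed, not from the paper).
import numpy as np

rng = np.random.default_rng(1)

def sample(n):
    x = rng.uniform(-1.0, 1.0, size=n)
    y = 2.0 * x + 1.0 + rng.normal(scale=0.3, size=n)
    return x, y

x_test, y_test = sample(100_000)          # large sample used as a proxy for R(w)

for l in (10, 100, 1000, 10000):
    x_tr, y_tr = sample(l)
    X = np.column_stack([np.ones(l), x_tr])
    w_star, *_ = np.linalg.lstsq(X, y_tr, rcond=None)
    risk = np.mean((y_test - (w_star[0] + w_star[1] * x_test)) ** 2)
    # In this setup the minimum of R(w) is the noise variance 0.3**2 = 0.09.
    print(f"l = {l:6d}   approx. R(w*_l) = {risk:.4f}")
```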
Why is the rate of convergence in question 2 important?