2

tl;dr:

The main question is if I use an inequality that is true with a certain probability (confidence) twice, do I get the same confidence?

Original:

I've got the following exercise:

enter image description here

Where $e_p(h)$ denotes the true 0-1 loss of the hypothesis $h$ over the true distribution $P$, and where the labels/outputs are binary $\in \{0,1\}$.

Now, from (a) I've got that $e_p(h) \le e_s(h) +\sqrt{\frac{1}{2n}\ln\frac{2}{\delta w(h)}}$, and $e_s(h) \le e_p(h) +\sqrt{\frac{1}{2n}\ln\frac{2}{\delta w(h)}}$ where $e_s(h)$ is the sample (average) error. This was derived using Hoeffding's inequality, and is true with confidence $1-\delta$.

Now, to solve (b), I used the following:

$e_p(h_{srm}) - e_p(h^*) \le e_s(h_{srm}) + \sqrt{\frac{1}{2n}\ln\frac{2}{\delta w(h_{srm})}} - e_p(h^*) \\ \le e_s(h^*) + \sqrt{\frac{1}{2n}\ln\frac{2}{\delta w(h^*)}} - e_p(h^*) \le 2\sqrt{\frac{1}{2n}\ln\frac{2}{\delta w(h^*)}}$

I used (a) in the first move, definition of SRM in the 2nd, and then (a) again.

The problem, and my question is, if I used Hoeffding's inequality twice for two distinct hypothesis, doesn't it mean my confidence is no longer $1-\delta$ but rather $(1-\delta)^2$ ?

Maverick Meerkat
  • 2,147
  • 14
  • 27

1 Answers1

1

The typical way to handle such questions is via the union bound, which states that for any events $A$ and $B$ we have $P(A \cup B) \leq P(A) + P(B)$.

Let $A_{1}$ and $A_{2}$ be events denoting failure of the first and the second application of Hoeffding's inequality respectively. We have $P(A_{1}) \leq \delta$ and $P(A_{2}) \leq \delta$. Note that events $A_{1}$ and $A_{2}$ need not be independent.

Then, your result holds if neither $A_{1}$ nor $A_{2}$ happen (Hoeffding's inequality succeeds both times), which means $A_{1}^{c} \cap A_{2}^{c}$ happens.

We can proceed as follows: \begin{align*} P(A_{1}^{c} \cap A_{2}^{c}) &= P((A_{1} \cup A_{2})^{c}) \\ &= 1 - P(A_{1} \cup A_{2}) \\ &\geq 1 - P(A_{1}) - P(A_{2}) & \text{by the union bound} \\ &\geq 1 - 2\delta. \end{align*}

Hence, your result holds with probability at least $1 - 2\delta$.

user123098123
  • 406
  • 3
  • 4