I've read through Hazan's paper on online convex optimization, and I don't quite understand why the regularization term must be strongly convex rather than satisfying a more relaxed condition such as strict convexity.

https://ie.technion.ac.il/~ehazan/papers/shalom.pdf

For instance, we want to compute $$x = \operatorname{argmin}_{x \in X} \left\{ f^Tx + R(x) \right\}$$

Let $f \in \mathbb{R}^n$ and $x \in \mathbb{R}^n$, so $f^Tx$ is a linear function, and let $R(x)$ be a strongly convex regularization term such as $\|x\|^2$. Then the minimizer $x$ is unique, because the objective is strongly convex.

But what if $R(x)$ is merely strictly convex? That is a much more relaxed condition: the objective is still strictly convex, so the minimizer is again unique. What seems to be the issue with using strictly convex regularizers?
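To make the question concrete, here is the kind of example I have in mind (my own, not from the paper): $R(x) = x^4$ on $X = [-1, 1]$ is strictly convex, since $R$ is convex and $R''(x) = 12x^2 > 0$ for all $x \neq 0$, but it is not strongly convex, because strong convexity with modulus $\sigma > 0$ would require $R''(x) \geq \sigma$ everywhere, which fails at $x = 0$.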

Fraïssé
  • It's part of the proof that the algorithm can achieve logarithmic regret, not part of the proof that the algorithm will converge. – jbowman Mar 25 '17 at 14:46

1 Answer


This is actually quite an interesting question. It's best to think of the RFTL algorithm as most naturally working with quadratic norms $R(x)=\|x\|_A^2$ for some positive definite matrix $A$. If you look at the proof, the support for other regularisers is handled by bounding the error by such a norm, where the matrix $A$ is the Hessian of $R$ at some (unknown) point $z$. In order for the Hessian to be positive definite everywhere (and hence induce a well-defined norm everywhere), $R$ needs to be strongly convex. In effect, strongly convex regularisers behave approximately quadratically, at least locally.
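To spell out that step (my paraphrase of the argument, not a quote from the paper): by Taylor's theorem with the mean-value form of the remainder, for some $z$ on the segment between $x$ and $y$, $$R(y) = R(x) + \nabla R(x)^T(y-x) + \tfrac{1}{2}\|y-x\|_{\nabla^2 R(z)}^2,$$ and strong convexity guarantees $\nabla^2 R(z) \succeq \sigma I$ for some $\sigma > 0$ regardless of where $z$ lands, so the remainder is bounded below by $\tfrac{\sigma}{2}\|y-x\|^2$. Under mere strict convexity, $\nabla^2 R(z)$ can come arbitrarily close to singular, and no such uniform bound exists.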

Intuitively, a strictly convex function can be extremely close to flat in some regions of the space. If such a regulariser were used, it would have virtually no effect there, so the algorithm would essentially behave like unregularised follow-the-leader, which suffers linear regret in the worst case.
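Here is a quick numerical sketch of that last point (my own toy experiment, not from the paper; the loss sequence and constants are just illustrative): FTRL on $X = [-1, 1]$ with linear losses $f_t x$ and regulariser $\lambda x^2$, run on the classic alternating sequence that defeats follow-the-leader. With $\lambda$ of order $\sqrt{T}$ the iterates stay in the interior and regret is $O(\sqrt{T})$; with $\lambda \approx 0$ the update degenerates into follow-the-leader, the iterates jump between $\pm 1$, and regret is linear in $T$.

```python
import numpy as np

# Toy FTRL experiment (my own sketch, not code from Hazan's paper):
# linear losses f_t * x on X = [-1, 1], regulariser lam * x^2.
def ftrl(fs, lam):
    """Play x_t = argmin_{x in [-1,1]} (sum_{s<t} f_s) * x + lam * x^2."""
    xs, cum = [], 0.0
    for f in fs:
        # Closed-form unconstrained minimiser -cum/(2*lam), clipped to X.
        xs.append(float(np.clip(-cum / (2.0 * lam), -1.0, 1.0)))
        cum += f
    return np.array(xs)

T = 10_000
# Alternating sequence that makes follow-the-leader oscillate: the
# cumulative gradient flips sign every round, so FTL jumps between
# +1 and -1 and lands on the wrong side each time.
fs = np.array([0.5] + [(-1.0) ** t for t in range(1, T)])
best = min(fs.sum() * x for x in (-1.0, 1.0))  # best fixed action in hindsight

for lam in (np.sqrt(T), 1e-9):  # strong regularisation vs. effectively none
    xs = ftrl(fs, lam)
    print(f"lam = {lam:10.3g}   regret = {fs @ xs - best:10.1f}")
# Typical output: lam = 100 gives regret ~ 25 (order sqrt(T)), while
# lam = 1e-9 gives regret ~ 10000 (linear in T), i.e. FTL behaviour.
```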

AaronDefazio
  • Aha, thanks for the answer. So the issue is that a positive definite Hessian implies strict convexity, but not the other way around. Also, for online linear optimization, i.e. where $f$ is a linear function, what can we say about the convexity of $f^Tx + R(x)$ when $R(x)$ is strongly convex in some norm $\|\cdot\|_p$? For instance, $R(x) = x\log x$ is strongly convex in the 1-norm. Is $f^Tx + R(x)$ also strongly convex in $\|\cdot\|_p$? (I believe it is, but I have nothing to back it up short of a proof.) – Fraïssé Mar 29 '17 at 05:23
  • In finite dimensions, strong convexity in one norm implies strong convexity in every other norm, though the constant of strong convexity depends on the norm. Note also that the function $x\log x$ is only strongly convex on bounded subsets of $\mathbb{R}$. To answer your question: $f^Tx + R(x)$ is strongly convex when $R(x)$ is, since adding a linear term does not change the Hessian. – AaronDefazio Mar 29 '17 at 12:46