In adversarial settings, the pseudo-regret is used rather than the actual regret. The explanation I have been given is that with the actual regret the problem is no longer learnable (that is, the adversary can generate losses such that the regret is no longer sub-linear).
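
For concreteness, here is a sketch of the two quantities as I understand them. The notation is my own assumption, following the usual bandit convention: K arms, horizon n, loss \ell_{i,t} of arm i at round t, and I_t the (possibly randomized) arm the learner picks at round t. The expectation is over the learner's randomness and, if the adversary adapts to past choices, over the losses as well.

```latex
% Assumed notation: K arms, horizon n, loss \ell_{i,t} of arm i at round t,
% I_t the arm chosen by the learner at round t.
R_n = \sum_{t=1}^{n} \ell_{I_t,t} - \min_{i \in \{1,\dots,K\}} \sum_{t=1}^{n} \ell_{i,t}
\qquad \text{(regret)}

\bar{R}_n = \max_{i \in \{1,\dots,K\}} \mathbb{E}\!\left[ \sum_{t=1}^{n} \ell_{I_t,t} - \sum_{t=1}^{n} \ell_{i,t} \right]
\qquad \text{(pseudo-regret)}

% A maximum of expectations is at most the expectation of the maximum, so:
\bar{R}_n \le \mathbb{E}[R_n]
```

So the pseudo-regret compares against the arm that is best in expectation, while the regret compares against the arm that is best on the realized loss sequence, which is the harder benchmark.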

But I don't see how the adversary can do such a thing. Can you give a protocol for the adversary that makes sub-linear regret impossible for any policy?
