In Huber's celebrated paper on robust estimation, he considered the model $x_i \sim (1-\epsilon) P_\theta + \epsilon G$, where $P_\theta$ is assumed to be normal with mean $\theta$ (and unit variance). Under this model, each observation is drawn from some unknown contaminating distribution $G$ with probability $\epsilon$. The goal of the original paper is to estimate $\theta$ in a way that is robust to the contamination from $G$. If $G$ ranges over all possible probability density functions, is the set of functions $(1-\epsilon)P_\theta + \epsilon G$ convex?

I understand that the space of all cdfs is convex so $G$ by itself is a convex set. What about this mixture of $P_\theta$ and $G$?

cccfran

1 Answer

This space is indeed convex.

By the definition of a convex set, take any $t \in [0,1]$ and any two probability distributions $G_1, G_2$. We want to know whether the following distribution belongs to Huber's contamination model (with $\theta$ and $\varepsilon$ fixed): $$t ((1-\varepsilon)P_{\theta}+\varepsilon G_1)+(1-t)((1-\varepsilon)P_{\theta}+\varepsilon G_2).$$ Since $t + (1-t) = 1$, this is equal to $$(1-\varepsilon)P_{\theta}+\varepsilon (tG_1+(1-t)G_2),$$ which is again a contaminated Gaussian distribution, with contaminating distribution $tG_1+(1-t)G_2$. The latter is itself a probability distribution because the set of all probability distributions is convex.
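As a quick numerical sanity check of the identity above, here is a minimal sketch. The particular values of $\theta$, $\varepsilon$, $t$ and the two Gaussian contaminating densities `g1`, `g2` are arbitrary choices for illustration, not anything taken from Huber's paper.

```python
import math


def normal_pdf(x, mean=0.0, sd=1.0):
    """Density of N(mean, sd^2) at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def contaminated_pdf(x, theta, eps, g_pdf):
    """Density of (1 - eps) * N(theta, 1) + eps * G at x."""
    return (1.0 - eps) * normal_pdf(x, theta) + eps * g_pdf(x)


# Illustrative values only (assumptions, not from the source).
theta, eps, t = 0.3, 0.1, 0.25


def g1(x):
    return normal_pdf(x, 4.0, 0.5)


def g2(x):
    return normal_pdf(x, -3.0, 2.0)


def g_mix(x):
    # Convex combination of the two contaminating densities.
    return t * g1(x) + (1.0 - t) * g2(x)


grid = [-10.0 + 0.01 * i for i in range(2001)]

# Left side: convex combination of two contaminated densities.
# Right side: a single contaminated density with contamination g_mix.
max_gap = max(
    abs(
        t * contaminated_pdf(x, theta, eps, g1)
        + (1.0 - t) * contaminated_pdf(x, theta, eps, g2)
        - contaminated_pdf(x, theta, eps, g_mix)
    )
    for x in grid
)
print(max_gap)  # only floating-point rounding error is expected here
```

The two sides agree up to rounding at every grid point, which is just the algebraic identity evaluated numerically; it is not a proof, only a check that nothing was lost in the rearrangement.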

TMat
  • Thanks! If this is a convex set, solving for $\theta$ should be simple by treating $G$ as a nuisance parameter. What is the significance of the robust estimator? Is it the inference for $\theta$? – cccfran Apr 19 '21 at 19:07
  • I don't understand your question. Could you reformulate it? – TMat Apr 19 '21 at 20:48
  • Sorry for the confusion. The goal of the model is to estimate $\theta$. I understand that for a fixed $\theta$, this is a convex set. However, if we need to estimate $\theta$, i.e., if $\theta$ ranges over $\mathbb{R}$ and $G$ over all possible probability density functions, is the set $(1-\epsilon) P_\theta + \epsilon G$ still convex? – cccfran Apr 28 '21 at 04:11
  • It depends on the set $M=\{P_\theta, \theta \in \mathbb{R}\}$. If $M$ is convex, then yes, $(1-\varepsilon)P_\theta+\varepsilon G$ is convex (translation and dilation of a convex set). – TMat Apr 28 '21 at 10:42
  • I see. So if $M$ is the exponential family, then it should be convex. What if we also need to estimate everything, i.e. $\epsilon$, $\theta$ and $G$? It seems to me that it becomes a normal mixture problem that is no longer convex. – cccfran May 26 '21 at 19:06
  • $G$ is not fixed, it can be anything, so if you allow any $\varepsilon$, $(1-\varepsilon)P_\theta+\varepsilon G$ can be any distribution. So this is the whole space, which is convex... Now, if you restrict the search to $\varepsilon <1/2$, then this becomes a very hard problem indeed. Leaving convexity aside, it is not even identifiable (because $G$ can be anything). – TMat May 28 '21 at 13:32
  • Thanks again. Why does restricting $\epsilon < 1/2$ introduce the identifiability issue? I thought it should not be a problem since $P_\theta$ has a parametric form (let's assume a single mode, such as the normal distribution). In fact, we need to discretize $G$ when estimating it, i.e., $G(x) = \sum_{j=1}^M w_j \delta_{x \in (\pi_{j-1}, \pi_j]}$ where $\pi_j, j=1,\ldots,M$ are the grid points. Would this introduce more issues? – cccfran May 28 '21 at 15:16
  • No, you misunderstood what I said: restricting to $\varepsilon <1/2$ may make things identifiable (because then the majority of the information is about $P$). What you are asking, estimating $G, \varepsilon, P_\theta$ at the same time, is very hard; even estimating only $\varepsilon, P_\theta$ is hard, and most of the time $\varepsilon$ is assumed known in the literature. I can't help you with such a difficult problem. – TMat May 29 '21 at 10:42
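
To make the identifiability point in the comments concrete, here is a minimal sketch of what goes wrong once $\varepsilon$ reaches $1/2$. The specific choice of $\theta \in \{0, 1\}$, unit variance, and $\varepsilon = 1/2$ is an assumption made purely for illustration: with these values, the pairs $(\theta = 0,\ G = N(1,1))$ and $(\theta = 1,\ G = N(0,1))$ give exactly the same mixture, so the data cannot tell them apart.

```python
import math


def normal_pdf(x, mean):
    """Density of N(mean, 1) at x."""
    z = x - mean
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


eps = 0.5  # illustrative boundary case, not a value from the thread


def model_a(x):
    # theta = 0, contamination G = N(1, 1)
    return (1.0 - eps) * normal_pdf(x, 0.0) + eps * normal_pdf(x, 1.0)


def model_b(x):
    # theta = 1, contamination G = N(0, 1)
    return (1.0 - eps) * normal_pdf(x, 1.0) + eps * normal_pdf(x, 0.0)


grid = [-10.0 + 0.01 * i for i in range(2001)]
max_gap = max(abs(model_a(x) - model_b(x)) for x in grid)
print(max_gap)
```

The two densities coincide on the whole grid even though they correspond to different values of $\theta$. This only shows the failure at $\varepsilon = 1/2$; it does not by itself establish what happens for $\varepsilon < 1/2$.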