
I am interested in the setting of differential privacy. Let's say a random function $\mathcal{D}:X\to\mathbb{R}$ discriminates between (distinct) $x, y \in X$ in a differentially private way if $$ \mathbb{P}(\mathcal{D}(x) \in S) \le e^\epsilon \mathbb{P}(\mathcal{D}(y) \in S) $$ for all (Borel) subsets $S$, and the reverse inequality also holds ($\epsilon>0$ is fixed). (The prototypical example is $\mathcal{D}(x)=f(x) + $Laplace noise, but I don't want the question to focus on that.)
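(As a quick numerical illustration of the definition, here is a minimal sketch for the prototypical example, assuming a sensitivity-1 query so that $\mathcal{D}(x)$ and $\mathcal{D}(y)$ are Laplace distributions with scale $1/\epsilon$ and medians at most 1 apart; the pointwise density ratio, and hence the ratio for any Borel $S$, stays below $e^\epsilon$.)

```python
import numpy as np

eps = 0.5          # privacy parameter (illustrative value)
b = 1.0 / eps      # Laplace scale giving eps-DP for a sensitivity-1 query

def laplace_pdf(o, mu, b):
    """Density of Laplace(mu, b) at o."""
    return np.exp(-np.abs(o - mu) / b) / (2 * b)

# densities of D(x) and D(y) when f(x) = 0 and f(y) = 1
o = np.linspace(-20, 20, 100001)
ratio = laplace_pdf(o, 0.0, b) / laplace_pdf(o, 1.0, b)

# the pointwise density ratio never exceeds e^eps, which yields the DP
# inequality for every Borel set S by integrating over S
print(ratio.max() <= np.exp(eps) + 1e-12)   # True
```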

This definition implies an upper bound, in terms of $\epsilon$, on the power of the most powerful test of $H_0:x=y$ vs. $H_A: x \neq y$ (see the naive bound later in the question). I am not familiar with the literature and would appreciate a reference for the best known bound.

In a concrete setting with Laplace noise I have estimated the max power below. For this question I am interested in general bounds under mild assumptions. On the other hand, the more I look at this, the more I doubt much better general bounds are possible!


Edit [naive bound]: For the purposes of obtaining the bound, Neyman–Pearson (N-P) applies: after all, we are just testing the distribution of $\mathcal{D}(x)$ against that of $\mathcal{D}(y)$.

Let's say I just test whether a single observation $o$ is distributed according to $X := \mathcal{D}(x)$ ($H_0$) or $Y := \mathcal{D}(y)$ ($H_A$), and assume discreteness for convenience of presentation. This is an N-P setting, so the likelihood ratio gives the best test: $$ L(o):=\frac{\mathbb{P}(X=o)}{\mathbb{P}(Y=o)}. $$

Let the critical (rejection) region at $\alpha$ be $C$, i.e. $$ \alpha = \mathbb{P}(L(X) \in C) = \mathbb{P}(X \in L^{-1}(C)). $$ Now apply the DP definition with $S=L^{-1}(C)$ to get $$ \mathbb{P}(L(Y) \in C) = \mathbb{P}(Y \in L^{-1}(C)) \le \alpha e^\epsilon. $$
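As a sanity check on this single-observation bound, here is a small sketch (my own, assuming the Laplace setting: $X\sim\text{Laplace}(0,1/\epsilon)$ under $H_0$ and $Y\sim\text{Laplace}(1,1/\epsilon)$ under $H_A$) verifying that the power of a level-$\alpha$ likelihood-ratio rejection region never exceeds $\alpha e^\epsilon$:

```python
import math

def laplace_sf(t, mu, b):
    """P(Laplace(mu, b) > t)."""
    if t >= mu:
        return 0.5 * math.exp(-(t - mu) / b)
    return 1.0 - 0.5 * math.exp(-(mu - t) / b)

def single_obs_power(eps, alpha=0.05):
    b = 1.0 / eps                     # scale giving eps-DP for unit sensitivity
    # L(o) is non-increasing in o, so reject H0 for large o, choosing c so that
    # P_0(o > c) = alpha exactly.
    c = -math.log(2 * alpha) / eps    # valid for alpha <= 1/2
    return laplace_sf(c, 1.0, b)      # power = P(Y lands in the rejection region)

for eps in (0.1, 0.5, 1.0, 2.0):
    p = single_obs_power(eps)
    print(eps, round(p, 4), "<=", round(0.05 * math.exp(eps), 4))  # bound holds
```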

The bound derived this way for $n$ (independent) observations, applying the DP inequality once per observation, is $$ \mathbb{P}(L(Y_1,\ldots,Y_n) \in C) \le \alpha e^{n \epsilon}, $$ using hopefully obvious notation.

But for values of $n$ and $\epsilon$ I have seen in real life, $\alpha e^{n\epsilon} > 1$, so the bound is useless (e.g. $\alpha = 0.05$, $\epsilon = 0.5$, $n = 10$ already gives $0.05\, e^{5} \approx 7.4$). However, I can't see how to improve it easily. I don't mind some extra assumptions on $\mathcal{D}$, e.g. regularity or discreteness.


Edit2 [Comparing the naive bound to the actual best for $n=1$, Laplace noise]: I checked the $n=1$ test explicitly with Laplace noise (assuming the parameters are known and the medians differ by exactly 1, a situation that satisfies DP with scale $=1/\epsilon$). When $\epsilon$ is small (i.e. the Laplace variance is large) the log likelihood ratio is basically a delta at $\pm\epsilon$, and a critical region for $\alpha < e^{-2\epsilon}/2$ does not exist. When $\epsilon$ is large the Laplace distribution is highly concentrated around its median, the critical value is around $e^{-\epsilon} \approx 0$, and the test power is $\approx 0.5$. For moderate $\epsilon$ the bound from the naive application of the DP definition is good.

So in practice the best test power looks like below ($\alpha=0.05$ and we accept using a bigger critical region for small $\epsilon$):

[Plot: maximum power of the $n=1$ test with Laplace noise, $\alpha = 0.05$]


Edit3 [Naive bound is awful as $n$ increases]: As the number of observations increases the naive bound becomes lousier and lousier; by $n=10$ observations it is already useless. Below, power is estimated by simulation and plotted against the empirical $\alpha \times e^{n\epsilon}$, because an exact $0.05$ critical region doesn't exist for small $\alpha$ and $\epsilon$.

[Plot: simulated power vs. the naive bound $\alpha e^{n\epsilon}$ for $n>1$]
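For reference, here is a rough sketch of the kind of simulation described above (my reconstruction, not the original code: medians 1 apart, scale $1/\epsilon$, reject for a small summed log likelihood ratio, with the critical value taken as an empirical quantile under $H_0$, so the achieved level can overshoot $\alpha$ when the log-LR has atoms):

```python
import numpy as np

rng = np.random.default_rng(0)

def log_lr(obs, eps):
    # log of L(o) = f_0(o) / f_1(o), H0: Laplace(0, 1/eps) vs HA: Laplace(1, 1/eps)
    return eps * (np.abs(obs - 1.0) - np.abs(obs))

def simulate_power(eps, n, alpha=0.05, reps=100_000):
    b = 1.0 / eps
    llr0 = log_lr(rng.laplace(0.0, b, size=(reps, n)), eps).sum(axis=1)  # under H0
    llr1 = log_lr(rng.laplace(1.0, b, size=(reps, n)), eps).sum(axis=1)  # under HA
    crit = np.quantile(llr0, alpha)      # reject H0 when the summed log-LR is small
    emp_alpha = np.mean(llr0 <= crit)    # may exceed alpha (atoms at +/- eps)
    power = np.mean(llr1 <= crit)
    return emp_alpha, power

eps = 0.1
for n in (1, 10, 30):
    a, p = simulate_power(eps, n)
    print(n, round(a, 3), round(p, 3), "naive bound:", round(a * np.exp(n * eps), 3))
```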


Edit4 [Estimating the max power for Laplace noise using a CLT approximation to the log LR]

For small to moderate $\epsilon$ (such that the Laplace density is reasonably flat in a neighbourhood of the median) I approximated each term in the log likelihood ratio under the null hypothesis by $$ \ln(L_i) = \begin{cases} +\epsilon & \text{w.p. } 1/2, \\ -\epsilon & \text{w.p. } e^{-\epsilon}/2, \\ U[-\epsilon,\epsilon] & \text{otherwise,} \\ \end{cases} $$ and going from the null to the alternative hypothesis amounts to simply switching the sign (using the symmetry of the Laplace distribution). This is just a sum of nice distributions, and the central limit theorem has already kicked in strongly around $n=30$, i.e. the null distribution is approximately Gaussian with mean $n\epsilon (1 - e^{-\epsilon})/2$ and variance given by a hideous formula. The simulation below is with $n=30$.

[Plot: simulated distribution of the log likelihood ratio, $n=30$]

Using this I can estimate the power for larger $n$ (plot below). This basically answers the question for the Laplace mechanism, but I'm still interested in what can be deduced from the DP definition.

[Plot: estimated power for larger $n$ using the Gaussian approximation]
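A compact sketch of this approximation (my reconstruction: the mean matches the formula above, and the "hideous" variance is simply computed numerically rather than written out):

```python
import math
from statistics import NormalDist

def clt_power(eps, n, alpha=0.05):
    """Gaussian (CLT) approximation to the power of the best test, using the
    three-component approximation of each log-LR term under H0 given above;
    the signs flip under the alternative."""
    p_plus, p_minus = 0.5, math.exp(-eps) / 2
    p_mid = 1.0 - p_plus - p_minus
    mu = eps * (p_plus - p_minus)                    # = eps * (1 - e^-eps) / 2
    second_moment = eps**2 * (p_plus + p_minus) + (eps**2 / 3) * p_mid
    var = second_moment - mu**2                      # the "hideous formula"
    # Summed log-LR: approx N(n*mu, n*var) under H0 and N(-n*mu, n*var) under HA;
    # reject H0 when the sum falls below its alpha-quantile under H0.
    crit = n * mu + math.sqrt(n * var) * NormalDist().inv_cdf(alpha)
    return NormalDist().cdf((crit + n * mu) / math.sqrt(n * var))

for n in (10, 30, 100):
    print(n, round(clt_power(eps=0.1, n=n), 3))
```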

P.Windridge
  • Some recent work that derived the uniformly most powerful test for binomial data: https://journalprivacyconfidentiality.org/index.php/jpc/article/view/725 – Paul Nov 17 '20 at 04:19
  • Thanks Paul. Looks promising (but not read yet) – P.Windridge Nov 18 '20 at 21:57
  • See also Thm 2.4 of https://www.tandfonline.com/doi/abs/10.1198/jasa.2009.tm08651 which reproduces your naive bound. – James Jul 05 '21 at 04:59

1 Answer


If I understand your question correctly, you're looking for a characterization of differential privacy in the formalism of hypothesis testing. This idea was formalized by Kairouz et al. in The Composition Theorem for Differential Privacy; see Section 2 and Theorem 2.1 in particular.

Ted
  • Thanks Ted for the reply. I am not working on this now but skimmed the paper out of curiosity. Test power is $1 - $type II error, i.e. $1 - P_{MD}$ in the paper so my question could be rephrased "what is the type II error of the best test?", which Theorem 2.1 doesn't seem to answer - rather it's more like "differential privacy is equivalent to preventing both low type I *and* II error", and only tells me $P_{MD}\geq e^{-\epsilon}(1 - P_{FA})$ or power $\leq 1 - e^{-\epsilon}(1 - P_{FA})$. If I can stomach a high type I error this doesn't tell me much (and for my setting that was OK) – P.Windridge Apr 25 '20 at 09:44
  • Hope I didn't miss anything and +1 for the reference :) – P.Windridge Apr 25 '20 at 09:46