
1 Introduction

Differential privacy is a framework that allows statistical analysis of private databases while minimizing the risks to individuals in the databases. The idea is that an individual should be relatively unaffected whether he or she decides to join or opt out of a research dataset. More specifically, the probability distribution of outputs of a statistical analysis of a database should be nearly identical to the distribution of outputs on the same database with a single person’s data removed. Here the probability space is over the coin flips of the randomized differentially private algorithm that handles the queries. To formalize this, we call two databases \(D_{0}, D_{1}\) with n rows each \(\textit{neighboring}\) if they are identical on at least \(n-1\) rows, and define differential privacy as follows:

Definition 1.1

(Differential Privacy [2, 3]). A randomized algorithm M is \((\epsilon ,\delta )\) -differentially private if for all pairs of neighboring databases \(D_{0}\) and \(D_{1}\) and all output sets \(S\subseteq \mathrm {Range}(M)\)

$$ \Pr [M(D_{0})\in S] \le e^{\epsilon }\Pr [M(D_{1})\in S] + \delta $$

where the probabilities are over the coin flips of the algorithm M.

In the practice of differential privacy, we generally think of \(\epsilon \) as a small, non-negligible, constant (e.g. \(\epsilon =.1\)). We view \(\delta \) as a “security parameter” that is cryptographically small (e.g. \(\delta =2^{-30}\)). One of the important properties of differential privacy is that if we run multiple distinct differentially private algorithms on the same database, the resulting composed algorithm is also differentially private, albeit with some degradation in the privacy parameters \((\epsilon ,\delta )\). In this paper, we are interested in quantifying the degradation of privacy under composition. We will denote the composition of k differentially private algorithms \(M_{1},M_{2},\ldots ,M_{k}\) as \((M_{1},M_{2},\ldots ,M_{k})\) where

$$ (M_{1},M_{2},\ldots ,M_{k})(x)=(M_{1}(x),M_{2}(x),\ldots ,M_{k}(x)). $$

A handful of composition theorems already exist in the literature. The first basic result says:

Theorem 1.2

(Basic Composition [2]). For every \(\epsilon \ge 0\), \(\delta \in [0,1]\), and \((\epsilon ,\delta )\)-differentially private algorithms \(M_{1},M_{2},\ldots ,M_{k}\), the composition \((M_{1},M_{2},\ldots ,M_{k})\) satisfies \((k\epsilon ,k\delta )\)-differential privacy.

This tells us that under composition, the privacy parameters of the individual algorithms “sum up,” so to speak. We care about understanding composition because in practice we rarely want to release only a single statistic about a dataset. Releasing many statistics may require running multiple differentially private algorithms on the same database. Composition is also a very useful tool in algorithm design. Often, new differentially private algorithms are created by combining several simpler algorithms. Composition theorems help us analyze the privacy properties of algorithms designed in this way.

Theorem 1.2 shows a linear degradation in global privacy as the number of algorithms in the composition (k) grows, and it is of interest to improve on this bound. If we can prove that privacy degrades more slowly under composition, we can get more utility out of our algorithms under the same global privacy guarantees. Dwork, Rothblum, and Vadhan gave the following improvement on the basic summing composition above [5].

Theorem 1.3

(Advanced Composition [5]). For every \(\epsilon >0, \delta , \delta '>0,\) \(k\in \mathbb {N},\) and \((\epsilon ,\delta )\)-differentially private algorithms \(M_{1},M_{2},\ldots ,M_{k}\), the composition \((M_{1},M_{2},\ldots ,M_{k})\) satisfies \((\epsilon _{g},k\delta +\delta ')\)-differential privacy for

$$ \epsilon _{g} = \sqrt{2k\ln (1/\delta ')}\cdot \epsilon + k\cdot \epsilon \cdot (e^{\epsilon }-1)~. $$

Theorem 1.3 shows that privacy under composition degrades by a function of \(O(\sqrt{k\ln (1/\delta ')})\), which improves on Theorem 1.2 whenever \(\ln (1/\delta ')=o(k)\), i.e. whenever \(\delta '=2^{-o(k)}\). It can be shown that a degradation function of \(\varOmega (\sqrt{k\ln (1/\delta )})\) is necessary even for the simplest differentially private algorithms, such as randomized response [11].
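To make the comparison concrete, here is a small Python sketch (our own illustration, not from the literature; the function names are ours) evaluating both bounds:

```python
from math import exp, log, sqrt

def basic_eps_g(eps, k):
    """Basic composition (Theorem 1.2): the epsilons simply sum."""
    return k * eps

def advanced_eps_g(eps, k, delta_prime):
    """Advanced composition bound on eps_g (Theorem 1.3)."""
    return sqrt(2 * k * log(1 / delta_prime)) * eps + k * eps * (exp(eps) - 1)

# For k = 100 mechanisms at eps = 0.1 with slack delta' = 2**-30:
print(basic_eps_g(0.1, 100))               # 10.0
print(advanced_eps_g(0.1, 100, 2 ** -30))  # ~7.5, already better than summing
```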

Despite giving an asymptotically correct upper bound for the global privacy parameter, \(\epsilon _{g}\), Theorem 1.3 is not exact. We want an exact characterization because, beyond being theoretically interesting, constant factors in composition theorems can make a substantial difference in the practice of differential privacy. Furthermore, Theorem 1.3 only applies to “homogeneous” composition where each individual algorithm has the same pair of privacy parameters, \((\epsilon ,\delta )\). In practice we often want to analyze the more general case where some individual algorithms in the composition may offer more or less privacy than others. That is, given algorithms \(M_{1},M_{2},\ldots ,M_{k}\), we want to compute the best achievable privacy parameters for \((M_{1},M_{2},\ldots ,M_{k})\). Formally, we want to compute the function:

$$ \mathrm {OptComp}(M_{1},M_{2},\ldots ,M_{k},\delta _{g}) = \inf \{\epsilon _{g}:(M_{1},M_{2},\ldots ,M_{k}) ~\text {is}~ (\epsilon _{g},\delta _{g})\textsc {-DP}\}. $$

It is convenient for us to view \(\delta _{g}\) as given and then compute the best \(\epsilon _{g}\), but the dual formulation, viewing \(\epsilon _{g}\) as given, is equivalent (by binary search). Actually, we want a function that depends only on the privacy parameters of the individual algorithms:

$$\begin{aligned}&\mathrm {OptComp}((\epsilon _{1},\delta _{1}),(\epsilon _{2},\delta _{2}),\ldots ,(\epsilon _{k},\delta _{k}), \delta _{g}) =\\&\quad \sup \{\mathrm {OptComp}(M_{1},M_{2},\ldots ,M_{k},\delta _{g}):M_{i} ~\text {is}~ (\epsilon _{i},\delta _{i})\textsc {-DP} ~\forall i\in [k]\}. \end{aligned}$$

In other words we want \(\mathrm {OptComp}\) to give us the minimum possible \(\epsilon _{g}\) that maintains privacy for every sequence of algorithms with the given privacy parameters \((\epsilon _{i},\delta _{i})\). This definition refers to the case where the sequence of algorithms \((M_{1},\ldots ,M_{k})\) and the pair of neighboring databases \((D_{0},D_{1})\) on which they are applied are fixed, but we show that the same optimal bound holds even if the algorithms and databases are chosen adaptively, i.e. \(M_{i}\) and databases \((D_{0},D_{1})\) are chosen adaptively based on the outputs of \(M_{1},\ldots ,M_{i-1}\). (See Sect. 2 for a formal definition.)

A result from Kairouz, Oh, and Viswanath [9] characterizes \(\mathrm {OptComp}\) for the homogeneous case.

Theorem 1.4

(Optimal Homogeneous Composition [9]). For every \(\epsilon \ge 0\) and \(\delta \in [0,1)\), \(\mathrm {OptComp}((\epsilon ,\delta )_{1},(\epsilon ,\delta )_{2},\ldots ,(\epsilon ,\delta )_{k}, \delta _{g}) = (k-2i)\epsilon \), where i is the largest integer in \(\{0,1,\ldots ,\lfloor {k/2}\rfloor \}\) such that

$$ \frac{\sum \limits _{l=0}^{i-1}\binom{k}{l}\left( e^{(k-l)\epsilon }-e^{(k-2i+l)\epsilon }\right) }{(1+e^{\epsilon })^{k}} \le 1-\frac{1-\delta _{g}}{(1-\delta )^{k}}~. $$

With this theorem the authors exactly characterize the composition behavior of differentially private algorithms in the homogeneous case, with a polynomial-time computable solution. The problem remains to find the optimal composition behavior for the more general heterogeneous case. Kairouz, Oh, and Viswanath also provide an upper bound for heterogeneous composition that generalizes the \(O(\sqrt{k\ln (1/\delta ')})\) degradation found in Theorem 1.3 for homogeneous composition, but they do not comment on how close it is to optimal.
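For concreteness, the following Python sketch transcribes Theorem 1.4 directly (our own code; it assumes \(\delta _{g}\ge 1-(1-\delta )^{k}\) so that at least \(i=0\) satisfies the condition, and the calls to exp() can overflow for large \(k\epsilon \)):

```python
from math import comb, exp

def opt_comp_homogeneous(eps, delta, k, delta_g):
    """eps_g for the k-fold composition of (eps, delta)-DP algorithms (Theorem 1.4)."""
    rhs = 1 - (1 - delta_g) / (1 - delta) ** k
    i_best = 0
    for i in range(k // 2 + 1):
        lhs = sum(comb(k, l) * (exp((k - l) * eps) - exp((k - 2 * i + l) * eps))
                  for l in range(i)) / (1 + exp(eps)) ** k
        if lhs <= rhs:
            i_best = i  # remember the largest i satisfying the condition
    return (k - 2 * i_best) * eps
```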

1.1 Our Results

We begin by extending the results of Kairouz, Oh, and Viswanath [9] to the general heterogeneous case.

Theorem 1.5

(Optimal Heterogeneous Composition). For all \(\epsilon _{1},\ldots ,\epsilon _{k} \ge 0\) and \(\delta _{1},\ldots ,\delta _{k},\delta _{g}\in [0,1), \mathrm {OptComp}((\epsilon _{1},\delta _{1}),(\epsilon _{2},\delta _{2}),\ldots ,(\epsilon _{k},\delta _{k}), \delta _{g})\) equals the least value of \(\epsilon _{g}\) such that

$$\begin{aligned} \frac{1}{\prod _{i=1}^{k}{(1+e^{\epsilon _{i}})}} \sum _{S\subseteq \{1,\ldots ,k\}}\max \left\{ e^{\sum \limits _{i\in S}\epsilon _{i}} - e^{\epsilon _{g}}\cdot e^{\sum \limits _{i\not \in S}\epsilon _{i}}, 0\right\} \le 1-\frac{1-\delta _{g}}{\prod _{i=1}^{k}{(1-\delta _{i})}}~. \end{aligned}$$
(1)

Theorem 1.5 exactly characterizes the optimal composition behavior for any arbitrary set of differentially private algorithms. It also shows that optimal composition can be computed in time exponential in k by computing the sum over \(S\subseteq \{1,\ldots ,k\}\) by brute force. Of course in practice an exponential-time algorithm is not satisfactory for large k. Our next result shows that this exponential complexity is necessary:
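The brute-force computation is simple enough to sketch in Python (our own code, exponential in k): for a candidate \(\epsilon _{g}\) we evaluate the left side of Eq. (1), and the smallest feasible \(\delta _{g}\) follows by solving Eq. (1) with equality. \(\mathrm {OptComp}(\ldots ,\delta _{g})\) is then the least \(\epsilon _{g}\) whose returned value is at most \(\delta _{g}\), found by binary search.

```python
from math import exp, prod

def min_delta_g(eps_list, delta_list, eps_g):
    """Smallest delta_g for which the composition is (eps_g, delta_g)-DP,
    by brute force over all 2^k subsets S in Eq. (1)."""
    k, total = len(eps_list), sum(eps_list)
    s = 0.0
    for mask in range(2 ** k):
        e_S = sum(eps_list[i] for i in range(k) if mask >> i & 1)
        s += max(exp(e_S) - exp(eps_g) * exp(total - e_S), 0.0)
    lhs = s / prod(1 + exp(e) for e in eps_list)
    # Eq. (1) with equality: lhs = 1 - (1 - delta_g) / prod(1 - delta_i)
    return 1 - (1 - lhs) * prod(1 - d for d in delta_list)
```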

Theorem 1.6

Computing \(\mathrm {OptComp}\) is \(\#P\)-complete, even on instances where \(\delta _{1}=\delta _{2}=\ldots =\delta _{k}=0\) and \(\sum _{i\in [k]}\epsilon _{i}\le \epsilon \) for any desired constant \(\epsilon >0\).

Recall that \(\#P\) is the class of counting problems associated with decision problems in NP. Being \(\#P\)-complete means that there is no polynomial-time algorithm for \(\mathrm {OptComp}\) unless there is a polynomial-time algorithm for counting the number of satisfying assignments of Boolean formulas (or, equivalently, for counting the number of solutions of all NP problems). So there is almost certainly no efficient algorithm for \(\mathrm {OptComp}\), and therefore no analytic solution. Despite the intractability of exact computation, we show that \(\mathrm {OptComp}\) can be approximated efficiently.

Theorem 1.7

There is a polynomial-time algorithm that given \(\epsilon _{1},\ldots ,\epsilon _{k}\ge 0, \delta _{1},\ldots ,\delta _{k}, \delta _{g} \in [0,1),\) and \(\eta >0\), outputs \(\epsilon ^{*}\) where

$$\begin{aligned} \mathrm {OptComp}((\epsilon _{1},\delta _{1}),\ldots ,(\epsilon _{k},\delta _{k}), \delta _{g})\le \epsilon ^{*}\le \mathrm {OptComp}((\epsilon _{1},\delta _{1}),\ldots ,(\epsilon _{k},\delta _{k}), e^{-\eta /2}\cdot \delta _{g})+\eta ~. \end{aligned}$$

The algorithm runs in \(O\left( \log \left( \frac{k}{\eta }\sum _{i=1}^{k}\epsilon _{i}\right) \frac{k^{2}}{\eta }\sum _{i=1}^{k}\epsilon _{i}\right) \) time assuming constant-time arithmetic operations.

Note that we incur a relative error of \(\eta \) in approximating \(\delta _{g}\) and an additive error of \(\eta \) in approximating \(\epsilon _{g}\). Since we always take \(\epsilon _{g}\) to be non-negligible or even constant, we get a very good approximation when \(\eta \) is polynomially small or even a constant. Thus, it is acceptable that the running time is polynomial in \(1/\eta \).

In addition to the results listed above, our proof of Theorem 1.5 also provides a somewhat simpler proof of the Kairouz-Oh-Viswanath homogeneous composition theorem (Theorem 1.4 [9]). The proof in [9] introduces a view of differential privacy through the lens of hypothesis testing and uses geometric arguments. Our proof relies only on elementary techniques commonly found in the differential privacy literature.

Practical Application. The theoretical results presented here were motivated by our work on an applied project called “Privacy Tools for Sharing Research Data”. We are building a system that will allow researchers with sensitive datasets to make differentially private statistics about their data available through data repositories using the Dataverse platform [1, 8]. Part of this system is a tool that helps both data depositors and data analysts distribute a global privacy budget across many statistics. Users select which statistics they would like to compute and are given estimates of how accurately each statistic can be computed. They can also redistribute their privacy budget according to which statistics they think are most valuable in their dataset. We implemented the approximation algorithm from Theorem 1.7 and integrated it with this tool to ensure that users get the most utility out of their privacy budget.

2 Technical Preliminaries

A useful notation for thinking about differential privacy is defined below.

Definition 2.1

For two discrete random variables Y and Z taking values in the same output space, the \(\delta \) -approximate max-divergence of Y and Z is defined as:

$$ D_{\infty }^{\delta }(Y\Vert Z)\equiv \max _{S:\,\Pr [Y\in S]\ge \delta }\left[ \ln \frac{\Pr [Y\in S]-\delta }{\Pr [Z\in S]}\right] , $$

where the maximum is over subsets S of the output space.

Notice that an algorithm M is \((\epsilon ,\delta )\)-differentially private if and only if for all pairs of neighboring databases \(D_{0},D_{1}\), we have \(D_{\infty }^{\delta }(M(D_{0})\Vert M(D_{1}))\le \epsilon \). The standard fact that differential privacy is closed under “post-processing” [3, 4] can now be formulated as:

Fact 2.2

If \(f:S\rightarrow R\) is any randomized function, then

$$ D_{\infty }^{\delta }(f(Y)\Vert f(Z))\le D_{\infty }^{\delta }(Y\Vert Z). $$
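For intuition, \(D_{\infty }^{\delta }\) can be computed by brute force for small finite distributions; the Python sketch below (our own code, exponential in the support size, so only suitable for toy examples) enumerates all output sets.

```python
from itertools import combinations
from math import inf, log

def max_divergence(p, q, delta):
    """delta-approximate max-divergence of distributions p, q (dicts outcome -> prob)."""
    outcomes = list(set(p) | set(q))
    best = -inf
    for r in range(1, len(outcomes) + 1):
        for S in combinations(outcomes, r):
            pS = sum(p.get(o, 0.0) for o in S)
            qS = sum(q.get(o, 0.0) for o in S)
            if pS <= delta:
                continue  # only sets with Pr[Y in S] > delta can contribute
            best = max(best, inf if qS == 0.0 else log((pS - delta) / qS))
    return best

# M is (eps, delta)-DP iff max_divergence over M(D0), M(D1) is at most eps
# for all pairs of neighboring databases D0, D1.
```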

Adaptive Composition. The composition results in our paper actually hold for a more general model of composition than the one described above. The model is called k-fold adaptive composition and was formalized in [5]. We generalize their formulation to the heterogeneous setting where privacy parameters may differ across different algorithms in the composition.

The idea is that instead of running k differentially private algorithms chosen all at once on a single database, we can imagine an adversary adaptively engaging in a “composition game.” The game takes as input a bit \(b\in \{0,1\}\) and privacy parameters \((\epsilon _{1},\delta _{1}),\ldots ,(\epsilon _{k},\delta _{k})\). A randomized adversary A tries to learn b through k rounds of interaction as follows: on the ith round of the game, A chooses an \((\epsilon _{i},\delta _{i})\)-differentially private algorithm \(M_{i}\) and two neighboring databases \(D_{(i,0)},D_{(i,1)}\). A then receives an output \(y_{i}\leftarrow M_{i}(D_{(i,b)})\) where the internal randomness of \(M_{i}\) is independent of the internal randomness of \(M_{1},\ldots ,M_{i-1}\). The choices of \(M_{i}, D_{(i,0)},\) and \(D_{(i,1)}\) may depend on \(y_{1},\ldots ,y_{i-1}\) as well as the adversary’s own randomness.

The outcome of this game is called the view of the adversary, \(V^{b}\), which is defined to be \((y_{1},\ldots ,y_{k})\) along with A’s coin tosses. The algorithms \(M_{i}\) and databases \(D_{(i,0)},D_{(i,1)}\) from each round can be reconstructed from \(V^{b}\). Now we can formally define privacy guarantees under k-fold adaptive composition.

Definition 2.3

We say that the sequences of privacy parameters \(\epsilon _{1},\ldots ,\epsilon _{k}\ge 0, \delta _{1},\ldots ,\delta _{k}\in [0,1)\) satisfy \((\epsilon _{g},\delta _{g})\)-differential privacy under adaptive composition if for every adversary A we have \(D_{\infty }^{\delta _{g}}(V^{0}\Vert V^{1})\le \epsilon _{g}\), where \(V^{b}\) represents the view of A in composition game b with privacy parameter inputs \((\epsilon _{1},\delta _{1}),\ldots ,(\epsilon _{k},\delta _{k})\).

Computing Real-Valued Functions. Many of the computations we discuss involve irrational numbers, and we need to be explicit about how we model such computations on finite, discrete machines. Namely, when we talk about computing a function \(f:\{0,1\}^{*}\rightarrow \mathbb {R}\), what we really mean is computing f to any desired number q of bits of precision. More precisely, given x and q, the task is to compute a number \(y\in \mathbb {Q}\) such that \(\left| f(x)-y\right| \le \frac{1}{2^{q}}\). We measure the complexity of algorithms for this task as a function of \(|x|+q\).

3 Characterization of OptComp

Following [9], we show that to analyze the composition of arbitrary \((\epsilon _{i},\delta _{i})\)-DP algorithms, it suffices to analyze the composition of the following simple variant of randomized response [11].

Definition 3.1

([9]). Define a randomized algorithm \(\tilde{M}_{(\epsilon ,\delta )}:\{0,1\} \rightarrow \{0,1,2,3\}\) as follows, setting \(\alpha =1-\delta \):

$$ \Pr [\tilde{M}_{(\epsilon ,\delta )}(0)=y]={\left\{ \begin{array}{ll} \delta &{}\text {if}~y=0\\ \alpha \cdot e^{\epsilon }/(1+e^{\epsilon }) &{}\text {if}~y=1\\ \alpha /(1+e^{\epsilon }) &{}\text {if}~y=2\\ 0 &{}\text {if}~y=3 \end{array}\right. } \qquad \Pr [\tilde{M}_{(\epsilon ,\delta )}(1)=y]={\left\{ \begin{array}{ll} 0 &{}\text {if}~y=0\\ \alpha /(1+e^{\epsilon }) &{}\text {if}~y=1\\ \alpha \cdot e^{\epsilon }/(1+e^{\epsilon }) &{}\text {if}~y=2\\ \delta &{}\text {if}~y=3 \end{array}\right. } $$

Note that \(\tilde{M}_{(\epsilon ,\delta )}\) is in fact \((\epsilon ,\delta )\)-DP. Kairouz, Oh, and Viswanath showed that \(\tilde{M}_{(\epsilon ,\delta )}\) can be used to simulate the output of every \((\epsilon ,\delta )\)-DP algorithm on adjacent databases.
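A sampler for \(\tilde{M}_{(\epsilon ,\delta )}\) is immediate from the definition; the Python sketch below (our own code) makes the asymmetry between the two inputs explicit.

```python
import random
from math import exp

def sample_m_tilde(b, eps, delta):
    """Sample M~_{(eps,delta)}(b) as in Definition 3.1, with alpha = 1 - delta."""
    alpha = 1 - delta
    heavy = alpha * exp(eps) / (1 + exp(eps))  # the more likely of outputs {1, 2}
    u = random.random()
    if b == 0:
        if u < delta:
            return 0                 # output 0 occurs only when b = 0
        return 1 if u < delta + heavy else 2
    else:
        if u < delta:
            return 3                 # output 3 occurs only when b = 1
        return 2 if u < delta + heavy else 1
```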

Lemma 3.2

([9]). For every \((\epsilon ,\delta )\)-DP algorithm M and neighboring databases \(D_{0},D_{1}\), there exists a randomized algorithm T such that \(T(\tilde{M}_{(\epsilon ,\delta )}(b))\) is identically distributed to \(M(D_{b})\) for \(b=0\) and \(b=1\).

Proof

We provide a new proof of this lemma in the full version of the paper [10].

Since \(\tilde{M}_{(\epsilon ,\delta )}\) can simulate any \((\epsilon ,\delta )\)-differentially private algorithm and it is known that post-processing preserves differential privacy (Fact 2.2), it follows that to analyze the composition of arbitrary differentially private algorithms, it suffices to analyze the composition of the \(\tilde{M}_{(\epsilon _{i},\delta _{i})}\)’s:

Lemma 3.3

For all \(\epsilon _{1},\ldots ,\epsilon _{k}\ge 0, \delta _{1},\ldots ,\delta _{k},\delta _{g}\in [0,1)\),

$$ \mathrm {OptComp}((\epsilon _{1},\delta _{1}),\ldots ,(\epsilon _{k},\delta _{k}), \delta _{g})= \mathrm {OptComp}(\tilde{M}_{(\epsilon _{1},\delta _{1})},\ldots ,\tilde{M}_{(\epsilon _{k},\delta _{k})},\delta _{g}). $$

Proof

Since \(\tilde{M}_{(\epsilon _{1},\delta _{1})},\ldots ,\tilde{M}_{(\epsilon _{k},\delta _{k})}\) are \((\epsilon _{1},\delta _{1}),\ldots ,(\epsilon _{k},\delta _{k})\)-differentially private, we have:

$$\begin{aligned}&\mathrm {OptComp}((\epsilon _{1},\delta _{1}),\ldots ,(\epsilon _{k},\delta _{k}), \delta _{g})&\\&\quad =\quad \sup \{\mathrm {OptComp}(M_{1},\ldots ,M_{k},\delta _{g}):M_{i} ~\text {is}~ (\epsilon _{i},\delta _{i})\textsc {-DP} ~\forall i\in [k]\}\\&\quad \ge \quad \mathrm {OptComp}(\tilde{M}_{(\epsilon _{1},\delta _{1})},\ldots ,\tilde{M}_{(\epsilon _{k},\delta _{k})},\delta _{g})~. \end{aligned}$$

For the other direction, it suffices to show that for every \(M_{1},\ldots ,M_{k}\) that are \((\epsilon _{1},\delta _{1}),\ldots ,(\epsilon _{k},\delta _{k})\)-differentially private, we have

$$\begin{aligned} \mathrm {OptComp}(M_{1},\ldots ,M_{k},\delta _{g})\le \mathrm {OptComp}(\tilde{M}_{(\epsilon _{1},\delta _{1})},\ldots ,\tilde{M}_{(\epsilon _{k},\delta _{k})},\delta _{g})~. \end{aligned}$$

That is,

$$ \inf \{\epsilon _{g}:(M_{1},\ldots ,M_{k}) ~\text {is}~ (\epsilon _{g},\delta _{g})\textsc {-DP}\}\le \inf \{\epsilon _{g}:(\tilde{M}_{(\epsilon _{1},\delta _{1})},\ldots ,\tilde{M}_{(\epsilon _{k},\delta _{k})}) ~\text {is}~ (\epsilon _{g},\delta _{g})\textsc {-DP}\}. $$

So suppose \((\tilde{M}_{(\epsilon _{1},\delta _{1})},\ldots ,\tilde{M}_{(\epsilon _{k},\delta _{k})})\) is \((\epsilon _{g},\delta _{g})\)-DP. We will show that \((M_{1},\ldots ,M_{k})\) is also \((\epsilon _{g},\delta _{g})\)-DP. Taking the infimum over \(\epsilon _{g}\) then completes the proof.

We know from Lemma 3.2 that for every pair of neighboring databases \(D_{0},D_{1}\), there must exist randomized algorithms \(T_{1},\ldots ,T_{k}\) such that \(T_{i}(\tilde{M}_{(\epsilon _{i},\delta _{i})}(b))\) is identically distributed to \(M_{i}(D_{b})\) for all \(i\in \{1,\ldots ,k\}\). By hypothesis we have

$$D_{\infty }^{\delta _{g}}\left( (\tilde{M}_{(\epsilon _{1},\delta _{1})}(0),\ldots ,\tilde{M}_{(\epsilon _{k},\delta _{k})}(0))\Vert (\tilde{M}_{(\epsilon _{1},\delta _{1})}(1),\ldots ,\tilde{M}_{(\epsilon _{k},\delta _{k})}(1))\right) \le \epsilon _{g}~. $$

Thus by Fact 2.2 we have:

$$\begin{aligned}&D_{\infty }^{\delta _{g}}\big ((M_{1}(D_{0}),\ldots ,M_{k}(D_{0}))\Vert (M_{1}(D_{1}),\ldots ,M_{k}(D_{1}))\big )=\\&\quad D_{\infty }^{\delta _{g}}\left( (T_{1}(\tilde{M}_{(\epsilon _{1},\delta _{1})}(0)),\ldots ,T_{k}(\tilde{M}_{(\epsilon _{k},\delta _{k})}(0)))\Vert (T_{1}(\tilde{M}_{(\epsilon _{1},\delta _{1})}(1)),\ldots ,T_{k}(\tilde{M}_{(\epsilon _{k},\delta _{k})}(1)))\right) \le \epsilon _{g}. \end{aligned}$$

Now we are ready to characterize \(\mathrm {OptComp}\) for an arbitrary set of differentially private algorithms.

Proof

(Proof of Theorem 1.5 ). Given \((\epsilon _{1},\delta _{1}),\ldots ,(\epsilon _{k},\delta _{k})\) and \(\delta _{g}\), let \(\tilde{M}^{k}(b)\) denote the composition \((\tilde{M}_{(\epsilon _{1},\delta _{1})}(b),\ldots ,\tilde{M}_{(\epsilon _{k},\delta _{k})}(b))\) and let \(\tilde{P}_{b}^{k}(x)\) be the probability mass function of \(\tilde{M}^{k}(b)\), for \(b=0\) and \(b=1\). By Lemma 3.3, \(\mathrm {OptComp}((\epsilon _{1},\delta _{1}),\ldots ,(\epsilon _{k},\delta _{k}),\delta _{g})\) is the smallest value of \(\epsilon _{g}\) such that:

$$ \delta _{g} \ge \max _{Q\subseteq \{0,1,2,3\}^{k}}\{\tilde{P}_{0}^{k}(Q)-e^{\epsilon _{g}}\cdot \tilde{P}_{1}^{k}(Q)\}. $$

Given \(\epsilon _{g}\), the set \(S\subseteq \{0,1,2,3\}^{k}\) that maximizes the right-hand side is

$$ S=S(\epsilon _{g})=\left\{ x\in \{0,1,2,3\}^{k}\mid \tilde{P}_{0}^{k}(x)\ge e^{\epsilon _{g}}\cdot \tilde{P}_{1}^{k}(x)\right\} . $$

We can further split \(S(\epsilon _{g})\) into \(S(\epsilon _{g}) = S_{0}(\epsilon _{g})\cup S_{1}(\epsilon _{g})\) with

$$\begin{aligned} S_{0}(\epsilon _{g})&= \left\{ x\in \{0,1,2,3\}^{k} \mid \tilde{P}_{1}^{k}(x) = 0\right\} , \\ S_{1}(\epsilon _{g})&= \left\{ x\in \{0,1,2,3\}^{k} \mid \tilde{P}_{0}^{k}(x) \ge e^{\epsilon _{g}}\cdot \tilde{P}_{1}^{k}(x) ~\text {and}~ \tilde{P}_{1}^{k}(x) > 0\right\} . \end{aligned}$$

Note that \(S_{0}(\epsilon _{g})\cap S_{1}(\epsilon _{g})=\emptyset \). We have \(\tilde{P}_{1}^{k}(S_{0}(\epsilon _{g}))=0\) and \(\tilde{P}_{0}^{k}(S_{0}(\epsilon _{g})) = 1-\Pr [\tilde{M}^{k}(0)\in \{1,2,3\}^{k}] = 1-\prod _{i=1}^{k}(1-\delta _{i})\). So

$$\begin{aligned} \tilde{P}_{0}^{k}(S(\epsilon _{g})) - e^{\epsilon _{g}}\tilde{P}_{1}^{k}(S(\epsilon _{g}))&= \tilde{P}_{0}^{k}(S_{0}(\epsilon _{g})) - e^{\epsilon _{g}}\tilde{P}_{1}^{k}(S_{0}(\epsilon _{g})) + \tilde{P}_{0}^{k}(S_{1}(\epsilon _{g})) - e^{\epsilon _{g}}\tilde{P}_{1}^{k}(S_{1}(\epsilon _{g})) \\&= 1-\prod _{i=1}^{k}(1-\delta _{i}) + \tilde{P}_{0}^{k}(S_{1}(\epsilon _{g})) - e^{\epsilon _{g}}\tilde{P}_{1}^{k}(S_{1}(\epsilon _{g})). \end{aligned}$$

Now we just need to analyze \(\tilde{P}_{0}^{k}(S_{1}(\epsilon _{g})) - e^{\epsilon _{g}}\tilde{P}_{1}^{k}(S_{1}(\epsilon _{g}))\). Notice that \(S_{1}(\epsilon _{g})\subseteq \{1,2\}^{k}\): every \(x\in S_{1}(\epsilon _{g})\) has \(\tilde{P}_{1}^{k}(x)>0\), which rules out coordinates equal to 0, and \(\tilde{P}_{0}^{k}(x)\ge e^{\epsilon _{g}}\cdot \tilde{P}_{1}^{k}(x)>0\), which rules out coordinates equal to 3. So we can write:

$$\begin{aligned} \tilde{P}_{0}^{k}(S_{1}(\epsilon _{g}))&-e^{\epsilon _{g}}\cdot \tilde{P}_{1}^{k}(S_{1}(\epsilon _{g}))\\&= \sum _{y\in \{1,2\}^{k}}\max \left\{ \prod _{i:y_{i}=1}\frac{(1-\delta _{i})e^{\epsilon _{i}}}{1+e^{\epsilon _{i}}}\cdot \prod _{i:y_{i}=2}\frac{(1-\delta _{i})}{1+e^{\epsilon _{i}}} -\right. \\&\qquad \left. e^{\epsilon _{g}}\prod _{i:y_{i}=1}\frac{(1-\delta _{i})}{1+e^{\epsilon _{i}}}\cdot \prod _{i:y_{i}=2}\frac{(1-\delta _{i})e^{\epsilon _{i}}}{1+e^{\epsilon _{i}}},0\right\} \\&=\prod _{i=1}^{k}\frac{1-\delta _{i}}{1+e^{\epsilon _{i}}}\sum _{y\in \{0,1\}^{k}}\max \left\{ \frac{e^{\sum _{i=1}^{k}{{\epsilon _{i}}}}}{e^{\sum _{i=1}^{k}y_{i}\epsilon _{i}}} - e^{\epsilon _{g}}\cdot e^{\sum _{i=1}^{k}y_{i}\epsilon _{i}}, 0\right\} . \end{aligned}$$

Putting everything together yields:

$$\begin{aligned} \delta _{g}&\ge \tilde{P}_{0}^{k}(S_{0}(\epsilon _{g})) - e^{\epsilon _{g}}\tilde{P}_{1}^{k}(S_{0}(\epsilon _{g})) + \tilde{P}_{0}^{k}(S_{1}(\epsilon _{g})) - e^{\epsilon _{g}}\tilde{P}_{1}^{k}(S_{1}(\epsilon _{g})) \\&= 1-\prod _{i=1}^{k}(1-\delta _{i}) + \frac{\prod _{i=1}^{k}(1-\delta _{i})}{\prod _{i=1}^{k}{(1+e^{\epsilon _{i}})}} \sum _{S\subseteq \{1,\ldots ,k\}}\max \left\{ e^{\sum \limits _{i\in S}\epsilon _{i}} - e^{\epsilon _{g}}\cdot e^{\sum \limits _{i\not \in S}\epsilon _{i}}, 0\right\} . \end{aligned}$$

Rearranging this inequality yields exactly condition (1), completing the proof.

We have characterized the optimal composition for an arbitrary set of differentially private algorithms \((M_{1},\ldots ,M_{k})\) under the assumption that the algorithms are chosen in advance and all run on the same database. Next we show that \(\mathrm {OptComp}\) under this restrictive model of composition actually characterizes optimal privacy under the more general k-fold adaptive composition discussed in Sect. 2 as well.

Theorem 3.4

The privacy parameters \(\epsilon _{1},\ldots ,\epsilon _{k}\ge 0, \delta _{1},\ldots ,\delta _{k}\in [0,1)\), satisfy \((\epsilon _{g},\delta _{g})\)-differential privacy under adaptive composition if and only if \(\mathrm {OptComp}((\epsilon _{1},\delta _{1}),\ldots ,(\epsilon _{k},\delta _{k}),\delta _{g})\le \epsilon _{g}\).

Proof

First suppose the privacy parameters \(\epsilon _{1},\ldots ,\epsilon _{k},\delta _{1},\ldots ,\delta _{k}\) satisfy \((\epsilon _{g},\delta _{g})\)-differential privacy under adaptive composition. Then \(\mathrm {OptComp}((\epsilon _{1},\delta _{1}),\ldots ,(\epsilon _{k},\delta _{k}),\delta _{g})\le \epsilon _{g}\) because adaptive composition is more general than the composition defining \(\mathrm {OptComp}\).

Conversely, suppose \(\mathrm {OptComp}((\epsilon _{1},\delta _{1}),\ldots ,(\epsilon _{k},\delta _{k}),\delta _{g})\le \epsilon _{g}\). In particular, this means \(\mathrm {OptComp}(\tilde{M}_{(\epsilon _{1},\delta _{1})},\ldots ,\tilde{M}_{(\epsilon _{k},\delta _{k})},\delta _{g})\le \epsilon _{g}\). To complete the proof, we must show that the privacy parameters \(\epsilon _{1},\ldots ,\epsilon _{k},\delta _{1},\ldots ,\delta _{k}\) satisfy \((\epsilon _{g},\delta _{g})\)-differential privacy under adaptive composition.

Fix an adversary A. On each round i, A uses its coin tosses r and the previous outputs \(y_{1},\ldots ,y_{i-1}\) to select an \((\epsilon _{i},\delta _{i})\)-differentially private algorithm \(M_{i}=M_{i}^{r,y_{1},\ldots ,y_{i-1}}\) and neighboring databases \(D_{0}=D_{0}^{r,y_{1},\ldots ,y_{i-1}},D_{1}=D_{1}^{r,y_{1},\ldots ,y_{i-1}}\). Let \(V^{b}\) be the view of A with the given privacy parameters under composition game b for \(b=0\) and \(b=1\).

Lemma 3.2 tells us that there exists an algorithm \(T_{i}=T_{i}^{r,y_{1},\ldots ,y_{i-1}}\) such that \(T_{i}(\tilde{M}_{(\epsilon _{i},\delta _{i})}(b))\) is identically distributed to \(M_{i}(D_{b})\) for both \(b=0,1\) for all \(i\in [k]\). Define \(\hat{T}(z_{1},\ldots ,z_{k})\) for \(z_{1},\ldots ,z_{k}\in \{0,1,2,3\}\) as follows:

  1. Randomly choose coins r for A.

  2. For \(i=1,\ldots ,k,\) let \(y_{i}\leftarrow T_{i}^{r,y_{1},\ldots ,y_{i-1}}(z_{i})\).

  3. Output \((r,y_{1},\ldots ,y_{k})\).

Notice that \(\hat{T}(\tilde{M}_{(\epsilon _{1},\delta _{1})}(b),\ldots ,\tilde{M}_{(\epsilon _{k},\delta _{k})}(b))\) is identically distributed to \(V^{b}\) for both \(b=0,1\). By hypothesis we have

$$D_{\infty }^{\delta _{g}}\left( (\tilde{M}_{(\epsilon _{1},\delta _{1})}(0),\ldots ,\tilde{M}_{(\epsilon _{k},\delta _{k})}(0))\Vert (\tilde{M}_{(\epsilon _{1},\delta _{1})}(1),\ldots ,\tilde{M}_{(\epsilon _{k},\delta _{k})}(1))\right) \le \epsilon _{g}. $$

Thus by Fact 2.2 we have:

$$ D_{\infty }^{\delta _{g}}\big (V^{0}\Vert V^{1}\big )=D_{\infty }^{\delta _{g}}\left( \hat{T}(\tilde{M}_{(\epsilon _{1},\delta _{1})}(0),\ldots ,\tilde{M}_{(\epsilon _{k},\delta _{k})}(0))\Vert \hat{T}(\tilde{M}_{(\epsilon _{1},\delta _{1})}(1),\ldots ,\tilde{M}_{(\epsilon _{k},\delta _{k})}(1))\right) \le \epsilon _{g}. $$

4 Hardness of OptComp

\(\#P\) is the class of all counting problems associated with decision problems in NP: its members are functions that count the number of solutions to some NP problem. More formally:

Definition 4.1

A function \(f:\{0,1\}^{*}\rightarrow \mathbb {N}\) is in the class \(\#P\) if there exists a polynomial \(p:\mathbb {N}\rightarrow \mathbb {N}\) and a polynomial time algorithm M such that for every \(x\in \{0,1\}^{*}\):

$$ f(x)=\left| \left\{ y\in \{0,1\}^{p(|x|)}:M(x,y)=1\right\} \right| . $$

Definition 4.2

A function g is called \(\#P\) -hard if every function \(f\in \#P\) can be computed in polynomial time given oracle access to g. That is, evaluations of g can be done in one time step.

If a function is \(\#P\)-hard, then there is no polynomial-time algorithm for computing it unless there is a polynomial-time algorithm for counting the number of solutions of all NP problems.

Definition 4.3

A function f is called \(\#P\) -easy if there is some function \(g\in \#P\) such that f can be computed in polynomial time given oracle access to g.

If a function is both \(\#P\)-hard and \(\#P\)-easy, we say it is \(\#P\)-complete. Proving that computing \(\text {OptComp}\) is \(\#P\)-complete can be broken into two steps: showing that it is \(\#P\)-easy and showing that it is \(\#P\)-hard.

Lemma 4.4

Computing \(\text {OptComp}\) is \(\#P\)-easy.

Proof

A proof of this statement can be found in the full version of the paper [10].

Next we show that computing \(\text {OptComp}\) is also \(\#P\)-hard through a series of reductions. We start with a multiplicative version of the partition problem, shown to be \(\#P\)-complete by Ehrgott [7]. The problems in the chain of reductions are defined below.

Definition 4.5

\(\#\textsc {INT-PARTITION}\) is the following problem: given a set \(Z = \{z_{1}, z_{2}, \ldots , z_{k}\}\) of positive integers, count the number of partitions \(P\subseteq [k]\) such that

$$\begin{aligned} \prod _{i\in P}z_{i} - \prod _{i\not \in P}z_{i} = 0~. \end{aligned}$$
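A brute-force Python sketch of \(\#\textsc {INT-PARTITION}\) (our own code, exponential in k; it compares \((\prod _{i\in P}z_{i})^{2}\) with \(\prod _{i}z_{i}\) to stay in exact integer arithmetic):

```python
from math import prod

def count_int_partitions(z):
    """Count P subseteq [k] with prod_{i in P} z_i == prod_{i not in P} z_i."""
    k, total = len(z), prod(z)
    return sum(1 for mask in range(2 ** k)
               if prod(z[i] for i in range(k) if mask >> i & 1) ** 2 == total)

# count_int_partitions([2, 3, 6]) == 2: P = {6} and P = {2, 3} split the product evenly.
```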

All of the remaining problems in our chain of reductions take inputs \(\{w_{1},\ldots ,w_{k}\}\) where each \(w_{i}\in [1,e]\) is the Dth root of a positive integer, for some positive integer D. The reductions we present hold for every positive integer D, including \(D=1\), when the inputs are integers. We choose D large enough that the inputs lie in the range [1, e] because in the final reduction to \(\text {OptComp}\) the \(\epsilon _{i}\) values are set to \(\ln (w_{i})\); we want our reductions to hold for reasonable values of the \(\epsilon _{i}\)'s in a differential privacy setting, so throughout the proofs \(w_{i}\in [1,e]\) corresponds to \(\epsilon _{i}\in [0,1]\) in the final reduction. It is important to note, though, that the reductions still hold for any choice of positive integer D and thus any range of \(\epsilon _{i}\ge 0\).

Definition 4.6

\(\#\textsc {PARTITION}\) is the following problem: given a number \(D\in \mathbb {N}\) and a set \(W = \{w_{1}, w_{2}, \ldots , w_{k}\}\) of real numbers where for all \(i\in [k]\), \(1\le w_{i}\le e\) is the Dth root of a positive integer, count the number of partitions \(P\subseteq [k]\) such that

$$\begin{aligned} \prod _{i\in P}w_{i} - \prod _{i\not \in P}w_{i} = 0. \end{aligned}$$

Definition 4.7

\(\#\textsc {T-PARTITION}\) is the following problem: given a number \(D\in \mathbb {N}\), a set \(W = \{w_{1}, w_{2}, \ldots , w_{k}\}\) of real numbers where for all \(i\in [k]\), \(1\le w_{i}\le e\) is the Dth root of a positive integer, and a positive real number T, count the number of partitions \(P\subseteq [k]\) such that

$$\begin{aligned} \prod _{i\in P}w_{i} - \prod _{i\not \in P}w_{i} = T. \end{aligned}$$

Definition 4.8

\(\#\textsc {SUM-PARTITION}\) is the following problem: given a number \(D\in \mathbb {N}\), a set \(W = \{w_{1}, w_{2}, \ldots , w_{k}\}\) of real numbers where for all \(i\in [k]\), \(1\le w_{i}\le e\) is the Dth root of a positive integer, and a real number \(r>1\), find

$$ \sum _{P\subseteq [k]}\max \left\{ \prod _{i\in P}w_{i} - r\cdot \prod _{i\not \in P}w_{i},0\right\} . $$

We prove that computing \(\text {OptComp}\) is \(\#P\)-hard by the following series of reductions:

$$ \#\textsc {INT-PARTITION} \le \#\textsc {PARTITION} \le \#\textsc {T-PARTITION} \le \#\textsc {SUM-PARTITION} \le \mathrm {OptComp}. $$

Since \(\#\textsc {INT-PARTITION}\) is known to be \(\#P\)-complete [7], the chain of reductions will prove that \(\mathrm {OptComp}\) is \(\#P\)-hard.

Lemma 4.9

For every constant \(c>1\), \(\#\textsc {PARTITION}\) is \(\#P\)-hard, even on instances where \(\prod _{i}w_{i}\le c\).

Proof

Given an instance of \(\#\textsc {INT-PARTITION}\), \(\{z_{1},\ldots ,z_{k}\}\), we show how to find the solution in polynomial time using a \(\#\textsc {PARTITION}\) oracle. Set \(D=\lceil {\log _{c}(\prod _{i}z_{i})}\rceil \) and \(w_{i}=\root D \of {z_{i}} ~\forall i\in [k]\). Note that \(\prod _{i}w_{i}=\left( \prod _{i}z_{i}\right) ^{1/D}\le c\). Then for every \(P\subseteq [k]\):

$$\begin{aligned} \prod _{i\in P}w_{i} = \prod _{i\not \in P}w_{i}&\iff \left( \prod _{i\in P}w_{i}\right) ^{D} = \left( \prod _{i\not \in P}w_{i}\right) ^{D} \\&\iff \prod _{i\in P}z_{i} = \prod _{i\not \in P}z_{i}~. \end{aligned}$$

There is a one-to-one correspondence between solutions to the \(\#\textsc {PARTITION}\) problem and solutions to the given \(\#\textsc {INT-PARTITION}\) instance. We can solve \(\#\textsc {INT-PARTITION}\) in polynomial time with a \(\#\textsc {PARTITION}\) oracle. Therefore \(\#\textsc {PARTITION}\) is \(\#P\)-hard.

Lemma 4.10

For every constant \(c>1\), \(\#\textsc {T-PARTITION}\) is \(\#P\)-hard, even on instances where \(\prod _{i}w_{i}\le c\).

Proof

Let \(c>1\) be a constant. We will reduce from \(\#\textsc {PARTITION}\), so consider an instance of the \(\#\textsc {PARTITION}\) problem, \(W = \{w_{1}, w_{2}, \ldots , w_{k}\}\). We may assume \(\prod _{i}w_{i}\le \sqrt{c}\) since \(\sqrt{c}\) is also a constant greater than 1.

Set \(W'=W\cup \{w_{k+1}\}\), where \(w_{k+1}=\prod _{i=1}^{k}w_{i}\). Notice that \(\prod _{i=1}^{k+1}w_{i}\le (\sqrt{c})^{2}=c\). Set \(T = \sqrt{w_{k+1}}\left( w_{k+1}-1\right) \). Now we can use a \(\#\textsc {T-PARTITION}\) oracle to count the number of partitions \(Q\subseteq \{1,\ldots ,k+1\}\) such that

$$\begin{aligned} \prod _{i\in Q}w_{i} - \prod _{i\not \in Q}w_{i} = T~. \end{aligned}$$

Let \(P=Q\cap \{1,\ldots ,k\}\). We will argue that \(\prod _{i\in Q}w_{i}-\prod _{i\not \in Q}w_{i}=T\) if and only if \(\prod _{i\in P}w_{i}=\prod _{i\not \in P}w_{i}\), which completes the proof. There are two cases to consider: \(w_{k+1}\in Q\) and \(w_{k+1}\not \in Q\).

Case 1: \(w_{k+1}\in Q\). In this case, we have:

$$\begin{aligned}&w_{k+1}\cdot \left( \prod _{i\in P}w_{i}\right) - \prod _{i\not \in P}w_{i} = \prod _{i\in Q}w_{i}-\prod _{i\not \in Q}w_{i}= T = \sqrt{w_{k+1}}\left( w_{k+1}-1\right) \\&\iff \left( \prod _{i\in [k]}w_{i}\right) \left( \prod _{i\in P}w_{i}\right) ^{2} - \prod _{i\in [k]}w_{i}=\sqrt{\prod _{i\in [k]}w_{i}}\left( \prod _{i\in [k]}w_{i}-1\right) \left( \prod _{i\in P}w_{i}\right) \quad \text {(multiply both sides by } \textstyle \prod \limits _{i\in P}w_{i})\\&\iff \left( \prod _{i\in P}w_{i}-\sqrt{\prod _{i\in [k]}w_{i}}\right) \left( \prod _{i\in [k]}w_{i}\prod _{i\in P}w_{i} + \sqrt{\prod _{i\in [k]}w_{i}}\right) =0 \quad \text {(factor the quadratic in } \textstyle \prod \limits _{i\in P}w_{i})\\&\iff \prod _{i\in P}w_{i}=\sqrt{\prod _{i\in [k]}w_{i}}\\&\iff \prod _{i\not \in P}w_{i}=\prod _{i\in P}w_{i}~. \end{aligned}$$

So there is a one-to-one correspondence between solutions to the \(\#\textsc {T-PARTITION}\) instance \(W'\) where \(w_{k+1}\in Q\) and solutions to the original \(\#\textsc {PARTITION}\) instance W.

Case 2: \(w_{k+1}\not \in Q\). Solutions now look like:

$$ \prod _{i\in P}w_{i} - \prod _{i\in [k]}w_{i}\prod _{i\not \in P}w_{i} = \sqrt{\prod _{i\in [k]}w_{i}}\left( \prod _{i\in [k]}w_{i}-1\right) . $$

One way this can hold is if \(w_{i}=1\) for all \(i\in [k]\). We can check ahead of time whether the input set W consists entirely of ones. If it does, then there are \(2^{k}-2\) partitions that yield equal products (all except \(P=[k]\) and \(P=\emptyset \)), so we can output \(2^{k}-2\) as the solution without using the oracle. The only other way to satisfy the above expression is to have \(\prod _{i\in P}w_{i} > \prod _{i\in [k]}w_{i}\), which cannot happen because \(P\subseteq [k]\) and every \(w_{i}\ge 1\). So there are no solutions in the case that \(w_{k+1}\not \in Q\).

Therefore the output of the \(\#\textsc {T-PARTITION}\) oracle on \(W'\) is the solution to the \(\#\textsc {PARTITION}\) problem. So \(\#\textsc {T-PARTITION}\) is \(\#P\)-hard.

Lemma 4.11

For every constant \(c>1\), \(\#\textsc {SUM-PARTITION}\) is \(\#P\)-hard even on instances where \(\prod _{i}w_{i}\le c\).

Proof

We will use a \(\#\textsc {SUM-PARTITION}\) oracle to solve \(\#\textsc {T-PARTITION}\) given a set of Dth roots of positive integers \(W=\{w_{1}, \ldots , w_{k}\}\) and a positive real number T. Notice that for every \(z>0\):

$$\begin{aligned} \prod _{i\in P}w_{i} - \prod _{i\not \in P}w_{i}=z&\implies \prod _{i\in P}w_{i} - \frac{\prod _{i\in [k]}w_{i}}{\prod _{i\in P}w_{i}} = z \\&\implies \exists ~ j\in \mathbb {Z}^{+} ~\text {such that}~ \root D \of {j}-\frac{\prod _{i\in [k]}w_{i}}{\root D \of {j}}=z. \end{aligned}$$

Above, j must be a positive integer, which tells us that the gap in products from every partition must take a particular form. This means that for a given D and W, \(\#\textsc {T-PARTITION}\) can only be non-zero on a discrete set of possible values of \(T=z\). Given z, we can find a \(z'>z\) such that the above has no solutions in the interval \((z,z')\). Specifically, solve the above quadratic for \(\root D \of {j}\) (where j may or may not be an integer), let \(j'=\lfloor {j+1}\rfloor >j\), and \(z'=\root D \of {j'}-\frac{\prod _{i}w_{i}}{\root D \of {j'}}\). We use this property twice in the proof.

Define \(P^{z} \equiv \{P\subseteq [k] \mid \prod _{i\in P}w_{i} - \prod _{i\not \in P}w_{i} \ge z\}\). As described above we can find the interval \((T,T')\) of values above T with no solutions. Then, for every \(T''\in (T,T')\):

$$\begin{aligned} \left| \left\{ P\subseteq [k] \mid \prod _{i\in P}w_{i} - \prod _{i\not \in P}w_{i} = T\right\} \right|&= \left| P^{T}\backslash P^{T''}\right| \\&= \frac{1}{T}\left( \sum _{P\in P^{T}\backslash P^{T''}}\left( \prod _{i\in P}w_{i} - \prod _{i\not \in P}w_{i}\right) \right) \\&= \frac{1}{T}\left( \sum _{P\in P^{T}}\left( \prod _{i\in P}w_{i} - \prod _{i\not \in P}w_{i}\right) - \sum _{P\in P^{T''}}\left( \prod _{i\in P}w_{i} - \prod _{i\not \in P}w_{i}\right) \right) . \end{aligned}$$

We now show how to find \(\sum \limits _{P\in P^{z}}\left( \prod \limits _{i\in P}w_{i} - \prod \limits _{i\not \in P}w_{i}\right) \) for any \(z>0\) using the \(\#\textsc {SUM-PARTITION}\) oracle. Once we have this procedure, we can run it for \(z=T\) and \(z=T''\) and plug the outputs into the expression above to solve the \(\#\textsc {T-PARTITION}\) problem. We want to set the input r to the \(\#\textsc {SUM-PARTITION}\) oracle such that:

$$ \prod \limits _{i\in P}w_{i}-r\cdot \prod \limits _{i\not \in P}w_{i}\ge 0 \iff \prod \limits _{i\in P}w_{i}-\prod \limits _{i\not \in P}w_{i}\ge z. $$

Solving this expression for r gives:

$$ r_{z}=\frac{4\prod \limits _{i\in [k]}w_{i}}{\left( \sqrt{z^{2}+4\prod \limits _{i\in [k]}w_{i}} - z\right) ^{2}}. $$

Below we check that this setting satisfies the requirement.

$$\begin{aligned} \prod \limits _{i\in P}w_{i}-\frac{4\prod \limits _{i\in [k]}w_{i}}{\left( \sqrt{z^{2}+4\prod \limits _{i\in [k]}w_{i}} - z\right) ^{2}}\cdot \prod \limits _{i\not \in P}w_{i}\ge 0&\iff 1-\frac{4\left( \prod _{i\not \in P}w_{i}\right) ^{2}}{\left( \sqrt{z^{2}+4\prod \limits _{i\in [k]}w_{i}} - z\right) ^{2}} \ge 0\\&\iff \sqrt{z^{2}+4\prod _{i\in [k]}w_{i}}\ge 2\prod _{i\not \in P}w_{i} + z \\&\iff 4\prod _{i\in [k]}w_{i} \ge 4\left( \prod _{i\not \in P}w_{i}\right) ^{2} + 4z\prod _{i\not \in P}w_{i} \\&\iff \prod _{i\in P}w_{i}-\prod _{i\not \in P}w_{i}\ge z. \end{aligned}$$

So we have \(P^{z} = \left\{ P\subseteq [k]\mid \prod _{i\in P}w_{i}-r_{z}\cdot \prod _{i\not \in P}w_{i}\ge 0\right\} \) but this does not necessarily mean that

$$ \sum _{P\in P^{z}}\left( \prod _{i\in P}w_{i}-\prod _{i\not \in P}w_{i}\right) = \sum _{P\in P^{z}}\left( \prod _{i\in P}w_{i}-r_{z}\cdot \prod _{i\not \in P}w_{i}\right) . $$

The sum on the left-hand side without the \(r_{z}\) coefficient is what we actually need to compute. To get this we again use the discreteness of potential solutions to find \(z''\not =z\) such that \(P^{z}=P^{z''}\). We just pick \(z''\) from the interval \((z,z')\) of values above z that cannot possibly contain solutions to \(\#\textsc {T-PARTITION}\).

Running our \(\#\textsc {SUM-PARTITION}\) oracle for \(r_{z}\) and \(r_{z''}\) will output:

$$\begin{aligned}&\sum _{P\in P^{z}}\left( \prod _{i\in P}w_{i}-r_{z}\cdot \prod _{i\not \in P}w_{i}\right) \\&\sum _{P\in P^{z}}\left( \prod _{i\in P}w_{i}-r_{z''}\cdot \prod _{i\not \in P}w_{i}\right) \end{aligned}$$

This is just a system of two equations with two unknowns, and it can be solved for \(\sum _{P\in P^{z}}\prod _{i\in P}w_{i}\) and \(\sum _{P\in P^{z}}\prod _{i\not \in P}w_{i}\) separately. Then we can reconstruct \(\sum _{P\in P^{z}}\left( \prod _{i\in P}w_{i}-\prod _{i\not \in P}w_{i}\right) \). Running this procedure for \(z=T\) and \(z=T''\) gives us all of the information we need to count the number of solutions to the \(\#\textsc {T-PARTITION}\) instance we were given. We can solve \(\#\textsc {T-PARTITION}\) in polynomial time with four calls to a \(\#\textsc {SUM-PARTITION}\) oracle. Therefore \(\#\textsc {SUM-PARTITION}\) is \(\#P\)-hard.
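The threshold \(r_{z}\) is easy to compute, and the equivalence it is designed to satisfy can be spot-checked by brute force; the Python sketch below uses a tiny hypothetical instance of our own choosing.

```python
from math import prod, sqrt

def r_z(w, z):
    """r such that prod_P w - r * prod_notP w >= 0 iff prod_P w - prod_notP w >= z."""
    A = prod(w)
    return 4 * A / (sqrt(z * z + 4 * A) - z) ** 2

w, z = [1.2, 1.1, 1.3], 0.4  # hypothetical instance
for mask in range(2 ** len(w)):
    p = prod(w[i] for i in range(len(w)) if mask >> i & 1)
    not_p = prod(w) / p  # product over the complement of P
    assert (p - r_z(w, z) * not_p >= 0) == (p - not_p >= z)
```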

Now we prove that computing \(\text {OptComp}\) is \(\#P\)-complete.

Proof

(Proof of Theorem 1.6 ). We have already shown that computing \(\text {OptComp}\) is \(\#P\)-easy. Here we prove that it is also \(\#P\)-hard, thereby proving \(\#P\)-completeness.

Given an instance \(D, W=\{w_{1},\ldots ,w_{k}\}, r\) of \(\#\textsc {SUM-PARTITION}\), where \(\forall i\in [k]\), \(w_{i}\) is the Dth root of a positive integer and \(\prod _{i}w_{i}\le c\), set \(\epsilon _{i}=\ln (w_{i})~ \forall i\in [k]\), \(\delta _{1}=\delta _{2}=\ldots =\delta _{k}=0\) and \(\epsilon _{g}=\ln (r)\). Note that \(\sum _{i}\epsilon _{i}=\ln \left( \prod _{i}w_{i}\right) \le \ln (c)\). Since we can take c to be an arbitrary constant greater than 1, we can ensure that \(\sum _{i}\epsilon _{i}\le \epsilon \) for an arbitrary \(\epsilon >0\).

We will use the version of \(\mathrm {OptComp}\) that takes \(\epsilon _{g}\) as input and outputs the optimal \(\delta _{g}\) (recall from Sect. 1 that the two formulations are equivalent). After using an \(\mathrm {OptComp}\) oracle to find \(\delta _{g}\), we know that Eq. 1, the optimal composition condition from Theorem 1.5, is satisfied with equality:

$$ \frac{1}{\prod _{i=1}^{k}{(1+e^{\epsilon _{i}})}} \sum _{S\subseteq \{1,\ldots ,k\}}\max \left\{ e^{\sum \limits _{i\in S}\epsilon _{i}} - e^{\epsilon _{g}}\cdot e^{\sum \limits _{i\not \in S}\epsilon _{i}}, 0\right\} = 1-\frac{1-\delta _{g}}{\prod _{i=1}^{k}{(1-\delta _{i})}}=\delta _{g}~. $$

Thus we can compute:

$$\begin{aligned} \delta _{g}\cdot \prod _{i=1}^{k}{(1+e^{\epsilon _{i}})}&=\sum _{S\subseteq \{1,\ldots ,k\}}\max \left\{ e^{\sum \limits _{i\in S}\epsilon _{i}} - e^{\epsilon _{g}}\cdot e^{\sum \limits _{i\not \in S}\epsilon _{i}}, 0\right\} \\&= \sum _{S\subseteq \{1,\ldots ,k\}}\max \left\{ \prod _{i\in S}w_{i} - r\cdot \prod _{i\not \in S}w_{i}, 0\right\} ~. \end{aligned}$$

This last expression is exactly the solution to the instance of \(\#\textsc {SUM-PARTITION}\) we were given. We solved \(\#\textsc {SUM-PARTITION}\) in polynomial time with one call to an \(\mathrm {OptComp}\) oracle. Therefore computing \(\mathrm {OptComp}\) is \(\#P\)-hard.

5 Approximation of OptComp

Although we cannot hope to efficiently compute the optimal composition for a general set of differentially private algorithms (assuming P\(\not =\)NP or even FP\(\not =\#\)P), we show in this section that we can approximate \(\mathrm {OptComp}\) arbitrarily well in polynomial time.

Theorem 1.7

(Restated). There is a polynomial-time algorithm that given \(\epsilon _{1},\ldots ,\epsilon _{k}\ge 0, \delta _{1},\ldots ,\delta _{k}, \delta _{g} \in [0,1),\) and \(\eta >0\), outputs \(\epsilon ^{*}\) where

$$\begin{aligned} \mathrm {OptComp}((\epsilon _{1},\delta _{1}),\ldots ,(\epsilon _{k},\delta _{k}), \delta _{g})\le \epsilon ^{*}\le \mathrm {OptComp}((\epsilon _{1},\delta _{1}),\ldots ,(\epsilon _{k},\delta _{k}), e^{-\eta /2}\cdot \delta _{g})+\eta ~. \end{aligned}$$

The algorithm runs in \(O\left( \log \left( \frac{k}{\eta }\sum _{i=1}^{k}\epsilon _{i}\right) \frac{k^{2}}{\eta }\sum _{i=1}^{k}\epsilon _{i}\right) \) time assuming constant-time arithmetic operations.

We prove this theorem using the following three lemmas:

Lemma 5.1

Given non-negative integers \(a_{1},\ldots ,a_{k}\), B and weights \(w_{1},\ldots ,w_{k}\in \mathbb {R}\), one can compute

$$\sum _{\begin{array}{c} S\subseteq [k]~\mathrm {s.t.}\\ \sum \limits _{i\in S}a_{i}\le B~~ \end{array}}\prod _{i\in S}w_{i}$$

in time O(Bk).

Notice that the constraint in Lemma 5.1 is the same one that characterizes knapsack problems. Indeed, our algorithm for computing the constrained sum \(\sum _{S}\prod _{i\in S}w_{i}\) is a slight modification of the known pseudo-polynomial-time algorithm for counting knapsack solutions, which uses dynamic programming. Next we show that we can use this algorithm to approximate \(\mathrm {OptComp}\).

Lemma 5.2

Given \(\epsilon _{1},\ldots ,\epsilon _{k}, \epsilon ^{*}\ge 0, \delta _{1},\ldots \delta _{k}, \delta _{g} \in [0,1),\) if \(\epsilon _{i}=a_{i}\epsilon _{0} ~\forall i\in \{1,\ldots ,k\}\) for non-negative integers \(a_{i}\) and some \(\epsilon _{0}>0\), then there is an algorithm that determines whether or not \(\mathrm {OptComp}((\epsilon _{1},\delta _{1}),\ldots ,(\epsilon _{k},\delta _{k}),\delta _{g}) \le \epsilon ^{*}\) that runs in time \(O\left( \frac{k}{\epsilon _{0}}\sum _{i=1}^{k}\epsilon _{i}\right) \).

In other words, if the \(\epsilon \) values we are given are all integer multiples of some \(\epsilon _{0}\), we can determine whether or not the composition of those privacy parameters is \((\epsilon ^{*},\delta _{g})\)-DP in pseudo-polynomial time for every \(\epsilon ^{*}\ge 0\). This means that given any inputs to \(\mathrm {OptComp}\), if we discretize and polynomially bound the \(\epsilon _{i}\)’s, then we can check whether the parameters satisfy any global privacy guarantee in polynomial time. Once we have this, we only need to run binary search over values of \(\epsilon ^{*}\) to find the optimal one. That is, we can solve \(\mathrm {OptComp}\) exactly for a slightly different set of \(\epsilon _{i}\)’s. The next lemma tells us that the output of \(\mathrm {OptComp}\) on this different set of \(\epsilon _{i}\)’s can be used as a good approximation to \(\mathrm {OptComp}\) on the original \(\epsilon _{i}\)’s.

Lemma 5.3

For all \(\epsilon _{1},\ldots ,\epsilon _{k}, c\ge 0\) and \(\delta _{1},\ldots ,\delta _{k},\delta _{g}\in [0,1)\):

$$ \mathrm {OptComp}((\epsilon _{1}+c,\delta _{1}),\ldots ,(\epsilon _{k}+c,\delta _{k}),\delta _{g})\le \mathrm {OptComp}((\epsilon _{1},\delta _{1}),\ldots ,(\epsilon _{k},\delta _{k}),e^{-kc/2}\cdot \delta _{g})+kc~. $$

Next we prove the three lemmas and then show that Theorem 1.7 follows.

Proof

(Proof of Lemma 5.1 ). We modify Dyer’s algorithm for approximately counting solutions to knapsack problems [6]. The algorithm uses dynamic programming. Given non-negative integers \(a_{1},\ldots ,a_{k}\), B and weights \(w_{1},\ldots ,w_{k}\in \mathbb {R}\), define

$$F(r,s) = \sum _{\begin{array}{c} S\subseteq [r]~\mathrm {s.t.}\\ \sum \limits _{i\in S}a_{i}\le s~~ \end{array}}\prod _{i\in S}w_{i}~.$$

We want to compute F(k, B). We can find this by tabulating F(r, s) for \(0\le r\le k,~ 0\le s\le B\) using the recursion:

$$\begin{aligned} F(r,s)= {\left\{ \begin{array}{ll} 1 &{}\text {if}~r=0\\ F(r-1,s) + w_{r}F(r-1, s-a_{r}) &{}\text {if}~r>0~\text {and}~a_{r}\le s\\ F(r-1,s) &{}\text {if}~r>0~\text {and}~~a_{r}>s. \end{array}\right. } \end{aligned}$$

Each cell F(r, s) in the table can be computed in constant time given earlier cells \(F(r',s')\) where \(r'<r\). Thus filling the entire table takes time O(Bk).
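In Python, the dynamic program is only a few lines (our own transcription of the recursion above, keeping a rolling row of the table):

```python
def knapsack_weighted_sum(a, w, B):
    """F(k, B): sum over S with sum_{i in S} a_i <= B of prod_{i in S} w_i."""
    F = [1.0] * (B + 1)             # row r = 0: F(0, s) = 1 (the empty product)
    for r in range(1, len(a) + 1):
        G = list(F)                 # G becomes row r; cells with a_r > s keep F(r-1, s)
        for s in range(a[r - 1], B + 1):
            G[s] = F[s] + w[r - 1] * F[s - a[r - 1]]
        F = G
    return F[B]
```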

Proof

(Proof of Lemma 5.2 ). Given \(\epsilon _{1},\ldots ,\epsilon _{k}, \epsilon ^{*}\ge 0\) such that \(\epsilon _{i}=a_{i}\epsilon _{0} ~\forall i\in \{1,\ldots ,k\}\) for non-negative integers \(a_{i}\) and some \(\epsilon _{0}>0\), and \(\delta _{1},\ldots ,\delta _{k}, \delta _{g} \in [0,1)\), Theorem 1.5 tells us that answering whether or not

$$\begin{aligned} \mathrm {OptComp}((\epsilon _{1},\delta _{1}),\ldots ,(\epsilon _{k},\delta _{k}),\delta _{g}) \le \epsilon ^{*} \end{aligned}$$

is equivalent to answering whether or not the following inequality holds:

$$ \frac{1}{\prod _{i=1}^{k}{(1+e^{\epsilon _{i}})}} \sum _{S\subseteq \{1,\ldots ,k\}}\max \left\{ e^{\sum \limits _{i\in S}\epsilon _{i}} - e^{\epsilon ^{*}}\cdot e^{\sum \limits _{i\not \in S}\epsilon _{i}}, 0\right\} \le 1-\frac{1-\delta _{g}}{\prod _{i=1}^{k}{(1-\delta _{i})}} ~. $$

The right-hand side and the coefficient on the sum are easy to compute given the inputs so in order to check the inequality, we will show how to compute the sum. Define

$$\begin{aligned} K&=\left\{ T\subseteq [k] \mid \sum \limits _{i\not \in T}\epsilon _{i}\ge \epsilon ^{*}+\sum \limits _{i\in T}\epsilon _{i}\right\} \\&=\left\{ T\subseteq [k] \mid \sum \limits _{i\in T}\epsilon _{i}\le \frac{1}{2}\left( \sum \limits _{i=1}^{k}\epsilon _{i}-\epsilon ^{*}\right) \right\} \\&=\left\{ T\subseteq [k] \mid \sum \limits _{i\in T}a_{i}\le B\right\} ~\mathrm {for}~B=\left\lfloor {\frac{1}{2\epsilon _{0}}\left( \sum \limits _{i=1}^{k}\epsilon _{i}-\epsilon ^{*}\right) }\right\rfloor \end{aligned}$$

and observe that by setting \(T=S^{\mathsf {c}}\), we have

$$ \sum _{S\subseteq \{1,\ldots ,k\}}\max \left\{ e^{\sum \limits _{i\in S}\epsilon _{i}} - e^{\epsilon ^{*}}\cdot e^{\sum \limits _{i\not \in S}\epsilon _{i}}, 0\right\} = \sum _{T\in K}\left( \left( e^{\sum \limits _{i=1}^{k}\epsilon _{i}}\cdot \prod \limits _{i\in T}e^{-\epsilon _{i}}\right) -\left( e^{\epsilon ^{*}}\cdot \prod \limits _{i\in T}e^{\epsilon _{i}}\right) \right) . $$

We just need to compute this last expression and we can do it for each term separately since K is a set of knapsack solutions. Specifically, setting \(w_{i}=e^{-\epsilon _{i}}~\forall i\in [k]\), Lemma 5.1 tells us that we can compute \(\sum _{T\subseteq [k]}\prod _{i\in T}w_{i}\) subject to \(\sum _{i\in T}a_{i}\le B\), which is equivalent to \(\sum _{T\in K}\prod _{i\in T}e^{-\epsilon _{i}}\).

To compute \(\sum _{T\in K}\prod _{i\in T}e^{\epsilon _{i}}\), we instead set \(w_{i}=e^{\epsilon _{i}}\) and run the same procedure. Since we used the algorithm from Lemma 5.1, the running time is \(O(Bk)=O\left( \frac{k}{\epsilon _{0}}\sum _{i=1}^{k}\epsilon _{i}\right) \).
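Putting the pieces together, the Lemma 5.2 test might look as follows in Python (our own sketch, reusing knapsack_weighted_sum from the previous sketch; floating-point overflow for large \(\sum _{i}\epsilon _{i}\) is ignored):

```python
from math import exp, floor, prod

def satisfies_dp(a, eps0, deltas, delta_g, eps_star):
    """Check OptComp((a_i * eps0, delta_i)_i, delta_g) <= eps_star (Lemma 5.2)."""
    eps = [ai * eps0 for ai in a]
    total = sum(eps)
    B = floor((total - eps_star) / (2 * eps0))
    if B < 0:
        lhs = 0.0  # K is empty, so every term of the sum is zero
    else:
        s_neg = knapsack_weighted_sum(a, [exp(-e) for e in eps], B)
        s_pos = knapsack_weighted_sum(a, [exp(e) for e in eps], B)
        lhs = (exp(total) * s_neg - exp(eps_star) * s_pos) / prod(1 + exp(e) for e in eps)
    return lhs <= 1 - (1 - delta_g) / prod(1 - d for d in deltas)
```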

Proof

(Proof of Lemma 5.3 ). Let \(\mathrm {OptComp}((\epsilon _{1},\delta _{1}),\ldots ,(\epsilon _{k},\delta _{k}),e^{-kc/2}\cdot \delta _{g})=\epsilon _{g}\). From Eq. 1 in Theorem 1.5 we know:

$$ \frac{1}{\prod _{i=1}^{k}{(1+e^{\epsilon _{i}})}} \sum _{S\subseteq \{1,\ldots ,k\}}\max \left\{ e^{\sum \limits _{i\in S}\epsilon _{i}} - e^{\epsilon _{g}}\cdot e^{\sum \limits _{i\not \in S}\epsilon _{i}}, 0\right\} \le 1-\frac{1-e^{-kc/2}\cdot \delta _{g}}{\prod _{i=1}^{k}{(1-\delta _{i})}} ~. $$

Multiplying both sides by \(e^{kc/2}\) gives:

$$\begin{aligned} \frac{e^{kc/2}}{\prod _{i=1}^{k}{(1+e^{\epsilon _{i}})}} \sum _{S\subseteq \{1,\ldots ,k\}}\max \left\{ e^{\sum \limits _{i\in S}\epsilon _{i}} - e^{\epsilon _{g}}\cdot e^{\sum \limits _{i\not \in S}\epsilon _{i}}, 0\right\}&\le e^{kc/2}\cdot \left( 1-\frac{1-e^{-kc/2}\cdot \delta _{g}}{\prod _{i=1}^{k}{(1-\delta _{i})}}\right) \\&\le 1-\frac{1-\delta _{g}}{\prod _{i=1}^{k}{(1-\delta _{i})}}~. \end{aligned}$$

The above inequality together with Theorem 1.5 means that showing the following will complete the proof:

$$\begin{aligned}&\sum _{S\subseteq \{1,\ldots ,k\}}\max \left\{ e^{\sum \limits _{i\in S}(\epsilon _{i}+c)} - e^{\epsilon _{g}+kc}\cdot e^{\sum \limits _{i\not \in S}(\epsilon _{i}+c)}, 0\right\} \\&\qquad \le \frac{e^{kc/2}\cdot \prod _{i=1}^{k}{(1+e^{\epsilon _{i}+c})}}{\prod _{i=1}^{k}{(1+e^{\epsilon _{i}})}} \sum _{S\subseteq \{1,\ldots ,k\}}\max \left\{ e^{\sum \limits _{i\in S}\epsilon _{i}} - e^{\epsilon _{g}}\cdot e^{\sum \limits _{i\not \in S}\epsilon _{i}}, 0\right\} . \end{aligned}$$

Since \((1+e^{\epsilon _{i}+c})/(1+e^{\epsilon _{i}})\ge e^{c/2}\) for every \(\epsilon _{i}\ge 0\), it suffices to show:

$$\begin{aligned}&\sum _{S\subseteq \{1,\ldots ,k\}}\max \left\{ e^{\sum \limits _{i\in S}(\epsilon _{i}+c)} - e^{\epsilon _{g}+kc}\cdot e^{\sum \limits _{i\not \in S}(\epsilon _{i}+c)}, 0\right\} \le \\&\qquad \qquad {\sum _{S\subseteq \{1,\ldots ,k\}}}e^{kc}\cdot \max \left\{ e^{\sum \limits _{i\in S}\epsilon _{i}} - e^{\epsilon _{g}}\cdot e^{\sum \limits _{i\not \in S}\epsilon _{i}}, 0\right\} . \end{aligned}$$

This inequality holds term by term. If a right-hand term is zero \(\left( \sum _{i\in S}\epsilon _{i}\le \epsilon _{g}+\sum _{i\not \in S}\epsilon _{i}\right) \), then so is the corresponding left-hand term \(\left( \sum _{i\in S}(\epsilon _{i}+c)\le \epsilon _{g}+kc+\sum _{i\not \in S}(\epsilon _{i}+c)\right) \). For the nonzero terms, the factor of \(e^{kc}\) ensures that the right-hand terms are larger than the left-hand terms.

Proof

(Proof of Theorem 1.7 ). Lemma 5.2 tells us that we can determine whether a set of privacy parameters satisfies some global differential privacy guarantee if the \(\epsilon \) values are discretized. Notice that then we can solve \(\mathrm {OptComp}\) exactly for a discretized set of \(\epsilon \) values by running binary search over values of \(\epsilon ^{*}\) until we find the minimum \(\epsilon ^{*}\) that satisfies \((\epsilon ^{*},\delta _{g})\)-DP.

Given \(\epsilon _{1},\ldots ,\epsilon _{k}\), \(\delta _{1},\ldots ,\delta _{k},\delta _{g}\), and an additive error parameter \(\eta >0\), set \(a_{i}=\left\lceil {\frac{k}{\eta }\epsilon _{i}}\right\rceil \) and \(\epsilon _{i}'=\frac{\eta }{k}\cdot a_{i}\) for all \(i\in [k]\). With these settings, the \(a_{i}\)'s are non-negative integers and the \(\epsilon _{i}'\) values are all integer multiples of \(\epsilon _{0}=\eta /k\). Lemma 5.2 tells us that we can determine whether the new privacy parameters with \(\epsilon '\) values satisfy \((\epsilon ^{*},\delta _{g})\)-DP in time \(O\left( \frac{k^{2}}{\eta }\sum _{i=1}^{k}\epsilon _{i}\right) \). Running binary search over values of \(\epsilon ^{*}\) will then compute \(\mathrm {OptComp}((\epsilon _{1}',\delta _{1}),\ldots ,(\epsilon _{k}',\delta _{k}),\delta _{g})=\epsilon _{g}'\) exactly in time \(O\left( \log \left( \frac{k}{\eta }\sum _{i=1}^{k}\epsilon _{i}\right) \frac{k^{2}}{\eta }\sum _{i=1}^{k}\epsilon _{i}\right) \).

Notice that \(\epsilon _{i} \le \epsilon '_{i} \le \epsilon _{i} + \eta /k ~\forall i\in [k]\), so \(\epsilon _{g}'\ge \mathrm {OptComp}((\epsilon _{1},\delta _{1}),\ldots ,(\epsilon _{k},\delta _{k}),\delta _{g})\) by monotonicity of \(\mathrm {OptComp}\) in the \(\epsilon _{i}\)'s. Applying Lemma 5.3 with \(c=\eta /k\) shows that the output \(\epsilon _{g}'\) is at most \(\mathrm {OptComp}((\epsilon _{1},\delta _{1}),\ldots ,(\epsilon _{k},\delta _{k}), e^{-\eta /2}\cdot \delta _{g})+\eta \), as desired.
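For completeness, the whole approximation algorithm might look as follows in Python (our own sketch, reusing satisfies_dp from above; it assumes \(\delta _{g}\ge 1-\prod _{i}(1-\delta _{i})\), since otherwise no finite \(\epsilon _{g}\) exists, and it returns the upper end of the binary-search interval so that the reported guarantee always holds).

```python
from math import ceil

def approx_opt_comp(eps, deltas, delta_g, eta, tol=1e-9):
    """Approximate OptComp within the guarantees of Theorem 1.7."""
    k = len(eps)
    eps0 = eta / k
    a = [ceil(e / eps0) for e in eps]  # round each eps_i up to a multiple of eta/k
    lo, hi = 0.0, eps0 * sum(a)        # eps_g never exceeds basic (summing) composition
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if satisfies_dp(a, eps0, deltas, delta_g, mid):
            hi = mid                   # mid is a valid eps_g; try smaller
        else:
            lo = mid
    return hi
```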