Free gap estimates from the exponential mechanism, sparse vector, noisy max and related algorithms

Abstract

Private selection algorithms, such as the exponential mechanism, noisy max and sparse vector, are used to select items (such as queries with large answers) from a set of candidates, while controlling privacy leakage in the underlying data. Such algorithms serve as building blocks for more complex differentially private algorithms. In this paper we show that these algorithms can release additional information related to the gaps between the selected items and the other candidates for free (i.e., at no additional privacy cost). This free gap information can improve the accuracy of certain follow-up counting queries by up to 66%. We obtain these results from a careful privacy analysis of these algorithms. Based on this analysis, we further propose novel hybrid algorithms that can dynamically save additional privacy budget.

Notes

  1. This was a surprising result given the number of incorrect attempts at improving SVT based on flawed manual proofs [33] and shows the power of automated program verification techniques.

  2. The notion of adjacency depends on the application. Some papers define it so that D can be obtained from \(D^\prime \) by modifying one record [18]; others, by adding or deleting one record [15].

  3. That is, for each input D, there might be some random vectors H for which M does not terminate, but the total probability of these vectors is 0, so we can ignore them.

  4. In our algorithm, we set \(\sigma \) to be the standard deviation of the noise distribution.

  5. In the case of monotonic queries, if \(\forall i: q_i \ge q^\prime _i\), then the alignment changes slightly: We set \(\eta ^\prime =\eta \) (the random variable added to the threshold) and set the adjustment to noise in the winning “if” branches to \(q_i-q^\prime _i\) instead of \(1+q_i-q^\prime _i\). (Hence cost terms become \(|q_i-q^\prime _i|\) instead of \(|1+q_i-q^\prime _i|\).) If \(\forall i: q_i \le q^\prime _i\) then we keep the original alignment but in the cost calculation we note that \(|1+q_i-q^\prime _i|\le 1\) (due to the monotonicity and sensitivity).

  6. Selecting thresholds for SVT in experiments is difficult, but we feel this may be fairer than averaging the answers to the kth and \((k+1)\)th largest queries, as was done in prior work [33].

References

  1. Abadi, M., Chu, A., Goodfellow, I., McMahan, H.B., Mironov, I., Talwar, K., Zhang, L.: Deep learning with differential privacy. In: CCS (2016)

  2. Abowd, J.M.: The U.S. Census Bureau adopts differential privacy. In: KDD (2018)

  3. Albarghouthi, A., Hsu, J.: Synthesizing coupling proofs of differential privacy. In: POPL (2017)

  4. Barthe, G., Gaboardi, M., Gregoire, B., Hsu, J., Strub, P.Y.: Proving differential privacy via probabilistic couplings. In: LICS (2016)

  5. Beimel, A., Nissim, K., Stemmer, U.: Private learning and sanitization: pure vs. approximate differential privacy. Theory Comput. 12(1), 1–61 (2016)

  6. Bhaskar, R., Laxman, S., Smith, A., Thakurta, A.: Discovering frequent patterns in sensitive data. In: KDD (2010)

  7. Bittau, A., Erlingsson, U., Maniatis, P., Mironov, I., Raghunathan, A., Lie, D., Rudominer, M., Kode, U., Tinnes, J., Seefeld, B.: Prochlo: strong privacy for analytics in the crowd. In: SOSP (2017)

  8. Bun, M., Steinke, T.: Concentrated differential privacy: simplifications, extensions, and lower bounds. In: TCC (2016)

  9. U.S. Census Bureau: OnTheMap: longitudinal employer-household dynamics. https://lehd.ces.census.gov/applications/help/onthemap.html#!confidentiality_protection

  10. Chaudhuri, K., Hsu, D., Song, S.: The large margin mechanism for differentially private maximization. In: NIPS (2014)

  11. Chaudhuri, K., Monteleoni, C., Sarwate, A.D.: Differentially private empirical risk minimization. J. Mach. Learn. Res. 12(Mar), 1069–1109 (2011)

  12. Chen, Y., Machanavajjhala, A., Reiter, J.P., Barrientos, A.F.: Differentially private regression diagnostics. In: ICDM (2016)

  13. Ding, B., Kulkarni, J., Yekhanin, S.: Collecting telemetry data privately. In: NIPS (2017)

  14. Ding, Z., Wang, Y., Zhang, D., Kifer, D.: Free gap information from the differentially private sparse vector and noisy max mechanisms. In: PVLDB (2019)

  15. Dwork, C.: Differential privacy. In: ICALP (2006)

  16. Dwork, C., Kenthapadi, K., McSherry, F., Mironov, I., Naor, M.: Our data, ourselves: privacy via distributed noise generation. In: Annual International Conference on the Theory and Applications of Cryptographic Techniques, pp. 486–503. Springer (2006)

  17. Dwork, C., Lei, J.: Differential privacy and robust statistics. In: STOC (2009)

  18. Dwork, C., McSherry, F., Nissim, K., Smith, A.: Calibrating noise to sensitivity in private data analysis. In: Theory of Cryptography Conference. Springer (2006)

  19. Dwork, C., Roth, A.: The algorithmic foundations of differential privacy. Found. Trends Theor. Comput. Sci. 9(3–4), 211–407 (2014)

  20. Erlingsson, Ú., Feldman, V., Mironov, I., Raghunathan, A., Talwar, K., Thakurta, A.: Amplification by shuffling: from local to central differential privacy via anonymity. In: SODA (2019)

  21. Erlingsson, Ú., Pihur, V., Korolova, A.: RAPPOR: randomized aggregatable privacy-preserving ordinal response. In: CCS (2014)

  22. Fanaeepour, M., Rubinstein, B.I.P.: Histogramming privately ever after: Differentially-private data-dependent error bound optimisation. In: ICDE (2018)

  23. Geng, Q., Viswanath, P.: The optimal mechanism in differential privacy. In: ISIT (2014)

  24. Ghosh, A., Roughgarden, T., Sundararajan, M.: Universally utility-maximizing privacy mechanisms. In: STOC, pp. 351–360 (2009)

  25. Gumbel, E.: Statistical Theory of Extreme Values and Some Practical Applications: A Series of Lectures. Applied Mathematics Series, U.S. Government Printing Office, Washington (1954)

  26. Haney, S., Machanavajjhala, A., Abowd, J.M., Graham, M., Kutzbach, M., Vilhuber, L.: Utility cost of formal privacy for releasing national employer–employee statistics. In: SIGMOD (2017)

  27. Hardt, M., Ligett, K., McSherry, F.: A simple and practical algorithm for differentially private data release. In: NIPS (2012)

  28. Johnson, N., Near, J.P., Song, D.: Towards practical differential privacy for SQL queries. In: PVLDB (2018)

  29. Kotsogiannis, I., Machanavajjhala, A., Hay, M., Miklau, G.: Pythia: data dependent differentially private algorithm selection. In: SIGMOD (2017)

  30. Lehmann, E., Casella, G.: Theory of Point Estimation. Springer, Berlin (1998)

  31. Ligett, K., Neel, S., Roth, A., Waggoner, B., Wu, S.Z.: Accuracy first: selecting a differential privacy level for accuracy constrained ERM. In: NIPS (2017)

  32. Liu, J., Talwar, K.: Private selection from private candidates (2018). arXiv preprint arXiv:1811.07971

  33. Lyu, M., Su, D., Li, N.: Understanding the sparse vector technique for differential privacy. In: PVLDB (2017)

  34. Machanavajjhala, A., Kifer, D., Abowd, J., Gehrke, J., Vilhuber, L.: Privacy: from theory to practice on the map. In: ICDE (2008)

  35. Maddison, C.J., Tarlow, D., Minka, T.: A* sampling. In: NIPS (2014)

  36. McSherry, F., Talwar, K.: Mechanism design via differential privacy. In: FOCS (2007)

  37. McSherry, F.D.: Privacy integrated queries: an extensible platform for privacy-preserving data analysis. In: SIGMOD (2009)

  38. Mironov, I.: Rényi differential privacy. In: 30th IEEE Computer Security Foundations Symposium. CSF (2017)

  39. Nocedal, J., Wright, S.J.: Numerical Optimization, 2nd edn. Springer, New York (2006)

  40. Papernot, N., Song, S., Mironov, I., Raghunathan, A., Talwar, K., Erlingsson, Ú.: Scalable private learning with PATE. In: ICLR (2018)

  41. Raskhodnikova, S., Smith, A.D.: Lipschitz extensions for node-private graph statistics and the generalized exponential mechanism. In: FOCS (2016)

  42. Tang, J., Korolova, A., Bai, X., Wang, X., Wang, X.: Privacy loss in apple’s implementation of differential privacy. In: 3rd Workshop on the Theory and Practice of Differential Privacy at CCS (2017)

  43. Apple Differential Privacy Team: Learning with privacy at scale. Appl. Mach. Learn. J. 1(8), 1–25 (2017)

  44. Thakurta, A.G., Smith, A.: Differentially private feature selection via stability arguments, and the robustness of the lasso. In: COLT (2013)

  45. Wang, Y., Ding, Z., Wang, G., Kifer, D., Zhang, D.: Proving differential privacy with shadow execution. In: PLDI (2019)

  46. Zhang, D., Kifer, D.: Lightdp: towards automating differential privacy proofs. In: POPL (2017)

  47. Zhang, D., McKenna, R., Kotsogiannis, I., Hay, M., Machanavajjhala, A., Miklau, G.: Ektelo: A framework for defining differentially-private computations. In: SIGMOD (2018)

Acknowledgements

This work was supported by NSF Awards CNS-1702760 and CNS-1931686.

Author information

Correspondence to Zeyu Ding.

A Proofs

A.1 Proof of Theorem 4 (BLUE)

Proof

Let \(q_1, \ldots , q_k\) be the true answers to the k queries selected by the Noisy Top-K with Gap algorithm. Let \(\alpha _i\) be the estimate of \(q_i\) from the Laplace mechanism, and let \(g_i\) be the estimate of the gap between \(q_i\) and \(q_{i+1}\) from Noisy Top-K with Gap.

Recall that \(\alpha _i = q_i + \xi _i\) and \(g_i = q_i + \eta _i - q_{i+1} - \eta _{i+1}\) where \(\xi _i\) and \(\eta _i\) are independent Laplacian random variables. Assume without loss of generality that \({{\,\mathrm{Var}\,}}(\xi _i)=\sigma ^2\) and \({{\,\mathrm{Var}\,}}(\eta _i) = \lambda \sigma ^2\). Write in vector notation

$$\begin{aligned} {\varvec{q}} = \begin{bmatrix}q_1 \\ \vdots \\ q_k \end{bmatrix},\quad {\varvec{\xi }} = \begin{bmatrix}\xi _1 \\ \vdots \\ \xi _k \end{bmatrix},\quad {\varvec{\eta }} = \begin{bmatrix}\eta _1 \\ \vdots \\ \eta _k \end{bmatrix},\quad {\varvec{\alpha }} = \begin{bmatrix}\alpha _1 \\ \vdots \\ \alpha _k \end{bmatrix},\quad {\varvec{g}} = \begin{bmatrix}g_1 \\ \vdots \\ g_{k-1} \end{bmatrix}, \end{aligned}$$

then \({\varvec{\alpha }}= {\varvec{q}}+ {\varvec{\xi }}\) and \({\varvec{g}}= N({\varvec{q}}+{\varvec{\eta }})\) where

$$\begin{aligned} N = \begin{bmatrix} 1 &{} -1 &{} &{} &{} \\ &{} 1 &{} -1 &{} &{} \\ &{} &{} \ddots &{} \ddots &{} \\ &{} &{} &{} 1 &{} -1 \end{bmatrix}_{(k-1)\times k}. \end{aligned}$$

Our goal is then to find the best linear unbiased estimate (BLUE) \({\varvec{\beta }}\) of \({\varvec{q}}\) in terms of \({\varvec{\alpha }}\) and \({\varvec{g}}\). In other words, we need to find a \(k\times k\) matrix X and a \(k\times (k-1)\) matrix Y such that

$$\begin{aligned} {\varvec{\beta }}=X{\varvec{\alpha }}+ Y{\varvec{g}}\end{aligned}$$
(6)

with \(E(\left\Vert {\varvec{\beta }}- {\varvec{q}}\right\Vert ^2) \) as small as possible. Unbiasedness implies that \(\forall {\varvec{q}}, E({\varvec{\beta }}) = X{\varvec{q}}+ YN{\varvec{q}}= {\varvec{q}}\). Therefore \(X+YN = I_k\) and thus

$$\begin{aligned} X = I_k - YN. \end{aligned}$$
(7)

Plugging this into (6), we have \({\varvec{\beta }}= (I_k - YN){\varvec{\alpha }}+ Y{\varvec{g}}= {\varvec{\alpha }}-Y(N{\varvec{\alpha }}- {\varvec{g}})\). Since \({\varvec{\alpha }}= {\varvec{q}}+ {\varvec{\xi }}\) and \({\varvec{g}}= N({\varvec{q}}+{\varvec{\eta }})\), we have \(N{\varvec{\alpha }}-{\varvec{g}}= N({\varvec{q}}+ {\varvec{\xi }}- {\varvec{q}}- {\varvec{\eta }}) = N({\varvec{\xi }}- {\varvec{\eta }})\). Thus

$$\begin{aligned} {\varvec{\beta }}= {\varvec{\alpha }}- YN({\varvec{\xi }}- {\varvec{\eta }}). \end{aligned}$$
(8)

Writing \({\varvec{\theta }}= N({\varvec{\xi }}- {\varvec{\eta }})\), we have \( {\varvec{\beta }}- {\varvec{q}}= {\varvec{\alpha }}- {\varvec{q}}- Y{\varvec{\theta }}= {\varvec{\xi }}- Y{\varvec{\theta }}\). Therefore, finding the BLUE is equivalent to solving the optimization problem \(Y = \arg \min \varPhi \) where

$$\begin{aligned} \varPhi&= E(\left\Vert {\varvec{\xi }}- Y{\varvec{\theta }}\right\Vert ^2) = E(({\varvec{\xi }}- Y{\varvec{\theta }})^T({\varvec{\xi }}- Y{\varvec{\theta }}))\\&= E({\varvec{\xi }}^T{\varvec{\xi }}- {\varvec{\xi }}^TY{\varvec{\theta }}-{\varvec{\theta }}^TY^T{\varvec{\xi }}+ {\varvec{\theta }}^TY^TY{\varvec{\theta }}) \end{aligned}$$

Taking the partial derivative of \(\varPhi \) with respect to Y, we have

$$\begin{aligned} \frac{\partial \varPhi }{\partial Y}&= E(\varvec{0} - {\varvec{\xi }}{\varvec{\theta }}^T -{\varvec{\xi }}{\varvec{\theta }}^T + Y({\varvec{\theta }}{\varvec{\theta }}^T + {\varvec{\theta }}{\varvec{\theta }}^T)) \end{aligned}$$

Setting \(\frac{\partial \varPhi }{\partial Y} = 0\) gives \(YE({\varvec{\theta }}{\varvec{\theta }}^T) = E({\varvec{\xi }}{\varvec{\theta }}^T)\), and thus

$$\begin{aligned} Y = E({\varvec{\xi }}{\varvec{\theta }}^T) E({\varvec{\theta }}{\varvec{\theta }}^T)^{-1}. \end{aligned}$$
(9)

Since \(({\varvec{\xi }}{\varvec{\theta }}^T)_{ij} = \xi _i(\xi _j -\xi _{j+1} -\eta _j + \eta _{j+1} )\), we have

$$\begin{aligned} E({\varvec{\xi }}{\varvec{\theta }}^T)_{ij} = {\left\{ \begin{array}{ll} E(\xi _i^2) = {{\,\mathrm{Var}\,}}(\xi _i) = \sigma ^2 &{} i = j \\ -E(\xi _i^2) = -{{\,\mathrm{Var}\,}}(\xi _i) = -\sigma ^2 &{} i = j+1 \\ 0 &{}\text {otherwise} \end{array}\right. } \end{aligned}$$

Hence

$$\begin{aligned} E({\varvec{\xi }}{\varvec{\theta }}^T) = \sigma ^2\begin{bmatrix} 1 &{} &{} \\ -1 &{} \ddots &{} \\ &{} \ddots &{}1 \\ &{} &{} -1 \end{bmatrix}_{k\times (k-1)} = \sigma ^2N^T. \end{aligned}$$

Similarly, we have

$$\begin{aligned} ({\varvec{\theta }}{\varvec{\theta }}^T)_{ij}&= (\xi _i-\xi _{i+1} - \eta _i + \eta _{i+1})(\xi _j - \xi _{j+1} - \eta _j + \eta _{j+1}) \\&= \xi _i\xi _j + \xi _{i+1}\xi _{j+1} - \xi _{i}\xi _{j+1} -\xi _{i+1}\xi _{j} \\&\quad +\eta _i\eta _j + \eta _{i+1}\eta _{j+1} - \eta _{i}\eta _{j+1} -\eta _{i+1}\eta _{j} \\&\quad -(\xi _i -\xi _{i+1})(\eta _j-\eta _{j+1}) \\&\quad -(\eta _i -\eta _{i+1})(\xi _j-\xi _{j+1}) \end{aligned}$$

Thus

$$\begin{aligned}&E({\varvec{\theta }}{\varvec{\theta }}^T)_{ij}\\&= {\left\{ \begin{array}{ll} E(\xi _i^2 + \xi _{i+1}^2 + \eta _i^2 + \eta _{i+1}^2) = 2(1+\lambda )\sigma ^2 &{} i = j \\ E(-\xi _i^2 - \eta _i^2) = -(1+\lambda )\sigma ^2 &{} i = j+1 \\ E(-\xi _j^2 -\eta _j^2) = -(1+\lambda )\sigma ^2 &{} i = j-1 \\ 0 &{}\text {otherwise} \end{array}\right. } \end{aligned}$$

Hence

$$\begin{aligned} E({\varvec{\theta }}{\varvec{\theta }}^T) =(1+\lambda )\sigma ^2 \begin{bmatrix} 2 &{} -1 &{} &{} &{} &{}\\ -1 &{} 2 &{} -1 &{} &{} &{}\\ &{} \ddots &{} \ddots &{} \ddots &{} &{} \\ &{} &{} -1 &{} 2 &{} -1 \\ &{} &{} &{} -1 &{} 2 &{} \end{bmatrix}_{(k-1)\times (k-1)}. \end{aligned}$$

It can be directly computed that \(E({\varvec{\theta }}{\varvec{\theta }}^T)^{-1} \) is a symmetric matrix whose lower triangular part is

$$\begin{aligned} \frac{1}{k(1+\lambda )\sigma ^2}\begin{bmatrix} (k-1)\cdot 1 &{} \cdots &{} \cdots &{} \cdots &{} \cdots \\ (k-2)\cdot 1 &{} (k-2)\cdot 2 &{} \cdots &{} \cdots &{} \cdots \\ (k-3)\cdot 1 &{} (k-3)\cdot 2 &{} (k-3)\cdot 3 &{} \cdots &{} \cdots \\ \vdots &{} \vdots &{} \vdots &{} \ddots &{} \vdots \\ 1\cdot 1 &{} 1\cdot 2 &{} 1\cdot 3 &{} \cdots &{} 1\cdot (k-1), \\ \end{bmatrix} \end{aligned}$$

i.e., \(E({\varvec{\theta }}{\varvec{\theta }}^T)^{-1}_{ij} = E({\varvec{\theta }}{\varvec{\theta }}^T)^{-1}_{ji} = \frac{1}{k(1+\lambda )\sigma ^2}\cdot (k-i)\cdot j\) for all \(1\le j\le i \le k-1\). Therefore, \(Y = E({\varvec{\xi }}{\varvec{\theta }}^T) E({\varvec{\theta }}{\varvec{\theta }}^T)^{-1}= \)

$$\begin{aligned} \frac{1}{k(1+\lambda )}\left( \begin{bmatrix} k-1 &{} k-2 &{} \cdots &{}1 \\ k-1 &{} k-2 &{} \cdots &{}1 \\ k-1 &{} k-2 &{} \cdots &{}1 \\ \vdots &{} \vdots &{} \ddots &{}\vdots \\ k-1 &{} k-2 &{} \cdots &{} 1 \\ \end{bmatrix} - \begin{bmatrix} 0 &{} 0 &{} \cdots &{}0 \\ k &{} 0 &{} \cdots &{}0 \\ k &{} k &{} \cdots &{}0 \\ \vdots &{} \vdots &{} \ddots &{}0 \\ k &{} k &{} \cdots &{}k \\ \end{bmatrix} \right) _{k\times (k-1)} \end{aligned}$$

Hence

$$\begin{aligned} X = I_k - YN = \frac{1}{k(1+\lambda )}\begin{bmatrix} 1+k\lambda &{} 1 &{} \cdots &{} 1 \\ 1 &{} 1+k\lambda &{} \cdots &{} 1 \\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ 1 &{} 1 &{} \cdots &{} 1+k\lambda \\ \end{bmatrix}_{k\times k}. \end{aligned}$$

\(\square \)
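
To make the closed forms above concrete, here is a minimal NumPy sketch (our own illustration, not code from the paper; the function names blue_matrices and blue_combine and all parameter values are assumptions). It builds X and Y, combines noisy answers and gaps into the BLUE, and checks the unbiasedness condition of Eq. (7).

import numpy as np

def blue_matrices(k, lam):
    # Closed-form X (k x k) and Y (k x (k-1)) from the proof of Theorem 4,
    # where lam = Var(eta_i) / Var(xi_i).
    c = 1.0 / (k * (1.0 + lam))
    X = c * (np.ones((k, k)) + k * lam * np.eye(k))          # X = I_k - Y N
    Y = np.array([[c * ((k - j) if j >= i else -j)
                   for j in range(1, k)] for i in range(1, k + 1)])
    return X, Y

def blue_combine(alpha, g, lam):
    # BLUE beta = X alpha + Y g of the Laplace answers alpha (length k)
    # and the released gaps g (length k-1).
    X, Y = blue_matrices(len(alpha), lam)
    return X @ np.asarray(alpha) + Y @ np.asarray(g)

# Sanity check of unbiasedness (Eq. 7): X + Y N = I_k, where N is the
# (k-1) x k difference matrix with N[i, i] = 1 and N[i, i+1] = -1.
k, lam = 5, 1.0
N = np.zeros((k - 1, k))
for i in range(k - 1):
    N[i, i], N[i, i + 1] = 1.0, -1.0
X, Y = blue_matrices(k, lam)
assert np.allclose(X + Y @ N, np.eye(k))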

A.2 Proof of Corollary 1

Recall that \(\alpha _i = q_i + \xi _i\) and \(g_i = q_i + \eta _i - q_{i+1} - \eta _{i+1}\) where \(\xi _i\) and \(\eta _i\) are independent Laplacian random variables. Assume without loss of generality that \({{\,\mathrm{Var}\,}}(\xi _i)=\sigma ^2\) and \({{\,\mathrm{Var}\,}}(\eta _i) = \lambda \sigma ^2\) as before. From the matrices X and Y in Theorem 4 we have that \(\beta _i = \frac{x_i + y_i}{k(1+\lambda )}\) where

$$\begin{aligned} x_i&= \alpha _1 + \cdots + (1+k\lambda ) \alpha _i + \cdots + \alpha _k\\&= (q_1+\xi _1) + \cdots + (1+k\lambda ) (q_i+\xi _i) + \cdots + (q_k +\xi _k) \end{aligned}$$

and

$$\begin{aligned} y_i&= -g_1 -2g_2 - \cdots - (i-1)g_{i-1} \\&\quad + (k-i) g_i + \ldots + 2g_{k-2} + g_{k-1}\\&= -(q_1 + \eta _1) - (q_2+\eta _2) -\cdots - (q_{i-1} + \eta _{i-1}) \\&\quad + (k-1)(q_i + \eta _i) - (q_{i+1} + \eta _{i+1}) - \cdots - (q_{k} + \eta _k). \end{aligned}$$

Therefore

$$\begin{aligned} {{\,\mathrm{Var}\,}}(x_i)&= \sigma ^2 + \cdots + (1+k\lambda )^2\sigma ^2 + \cdots + \sigma ^2\\&= ( k^2\lambda ^2 + 2k\lambda + k)\sigma ^2 \\ {{\,\mathrm{Var}\,}}(y_i)&= \lambda \sigma ^2+ \cdots + (k-1)^2 \lambda \sigma ^2 + \cdots + \lambda \sigma ^2\\&= (k^2 - k)\lambda \sigma ^2 \end{aligned}$$

and thus \({{\,\mathrm{Var}\,}}(\beta _i) = \frac{{{\,\mathrm{Var}\,}}(x_i) + {{\,\mathrm{Var}\,}}(y_i)}{k^2(1+\lambda )^2} =\frac{1 + k\lambda }{k+k\lambda }\sigma ^2\). Since \({{\,\mathrm{Var}\,}}(\alpha _i) = {{\,\mathrm{Var}\,}}(\xi _i) = \sigma ^2\), it follows that \(\frac{{{\,\mathrm{Var}\,}}(\beta _i)}{{{\,\mathrm{Var}\,}}(\alpha _i)} = \frac{1 + k\lambda }{k+k\lambda }.\)
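
For a concrete sense of the savings (our own numbers, not from the paper): with \(k=5\) and \(\lambda =1\), the ratio is \(\frac{1+5\cdot 1}{5+5\cdot 1}=0.6\), i.e., the BLUE has 40% lower variance than the plain Laplace estimates; as \(k\rightarrow \infty \) with \(\lambda \) fixed, the ratio tends to \(\frac{\lambda }{1+\lambda }\).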

A.3 Proof of Lemma 3

The density function of \(\eta _i - \eta \) is \(f_{\eta _i-\eta } (z) = \int _{-\infty }^\infty f_{\eta _i}(x) f_{\eta }(x-z)\,{\mathrm{d}}x =\frac{\epsilon _0\epsilon _*}{4} \int _{-\infty }^\infty e^{-\epsilon _*\left|x\right|} e^{-\epsilon _0\left|x-z\right|}\,{\mathrm{d}}x.\) First consider the case \(\epsilon _0\ne \epsilon _*\). When \(z\ge 0\), we have

$$\begin{aligned}&f_{\eta _i-\eta } (z) \\&\quad = \frac{\epsilon _0\epsilon _*}{4} \int _{-\infty }^\infty e^{-\epsilon _*\left|x\right|} e^{-\epsilon _0\left|x-z\right|}~{\mathrm{d}}x\\&\quad = \frac{\epsilon _0\epsilon _*}{4} \Big ( \int _{-\infty }^0 e^{\epsilon _* x} e^{\epsilon _0(x-z)}~{\mathrm{d}}x ~ \\&\quad \quad +\int _{0}^z e^{-\epsilon _* x} e^{\epsilon _0(x-z)}~{\mathrm{d}}x + \int _{z}^\infty e^{-\epsilon _* x} e^{-\epsilon _0(x-z)}~{\mathrm{d}}x\Big )\\&\quad = \frac{\epsilon _0\epsilon _*}{4} \Big ( \frac{e^{-\epsilon _0z}}{\epsilon _0+\epsilon _*} + \frac{ e^{-\epsilon _*z} - e^{-\epsilon _0z}}{\epsilon _0-\epsilon _*} + \frac{e^{-\epsilon _*z}}{\epsilon _0+\epsilon _*} \Big )\\&\quad = \frac{\epsilon _0\epsilon _* (\epsilon _0 e^{-\epsilon _*z} - \epsilon _* e^{-\epsilon _0z})}{2(\epsilon _0^2 - \epsilon _*^2)} \end{aligned}$$

Thus by symmetry we have that for all \(z\in \mathbb {R}\), \(f_{\eta _i-\eta } (z) = \frac{\epsilon _0\epsilon _* (\epsilon _0 e^{-\epsilon _*\left|z\right|} - \epsilon _* e^{-\epsilon _0\left|z\right|})}{2(\epsilon _0^2-\epsilon _*^2)}\), and

$$\begin{aligned} \mathbb {P}(\eta _i - \eta \ge -t )&= \int _{-t}^{\infty } f_{\eta _i-\eta } (z)~{\mathrm{d}}z = \int _{-t}^{0} f_{\eta _i-\eta } (z)~{\mathrm{d}}z + \frac{1}{2} \\&= 1 - \frac{\epsilon _0^2 e^{-\epsilon _*t} - \epsilon _*^2e^{-\epsilon _0t}}{2(\epsilon _0^2 - \epsilon _*^2)}. \end{aligned}$$

Now if \(\epsilon _0 = \epsilon _*\), by similar computations we have \(f_{\eta _i-\eta } (z) = (\frac{\epsilon _0}{4} + \frac{\epsilon _0^2\left|z\right|}{4})e^{-\epsilon _0\left|z\right|}\) and \( \mathbb {P}(\eta _i - \eta \ge -t ) = 1 - (\frac{2+\epsilon _0t}{4})e^{-\epsilon _0t}. \)
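
As a quick sanity check of these closed forms (our own sketch, not part of the paper; all parameter values are illustrative and assume \(\epsilon _0\ne \epsilon _*\)), one can compare the tail formula against a Monte Carlo estimate, drawing \(\eta \) as Laplace noise with scale \(1/\epsilon _0\) and \(\eta _i\) with scale \(1/\epsilon _*\):

import numpy as np

rng = np.random.default_rng(0)
eps0, eps_star, t, n = 0.5, 0.3, 1.2, 1_000_000       # illustrative values

# Laplace noise with density (eps/2) * exp(-eps * |x|) has scale 1/eps.
eta = rng.laplace(scale=1.0 / eps0, size=n)
eta_i = rng.laplace(scale=1.0 / eps_star, size=n)

empirical = np.mean(eta_i - eta >= -t)
closed_form = 1 - (eps0**2 * np.exp(-eps_star * t)
                   - eps_star**2 * np.exp(-eps0 * t)) / (2 * (eps0**2 - eps_star**2))
print(empirical, closed_form)                          # the two should agree closely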

A.4 Proofs in Sect. 8 (Exponential Mechanism with Gap)

A well-known, but inefficient, folklore algorithm for the exponential mechanism is based on the Gumbel-Max trick [25, 35]: Given numbers \(\mu _1,\dots , \mu _n\), add independent Gumbel(0) noise to each and select the index of the largest noisy value. This is the same as sampling the ith item with probability proportional to \(e^{\mu _i}\). Let \({{\,\mathrm{Cat}\,}}(\mu _1,\dots , \mu _n)\) denote the categorical distribution that returns item \(\omega _i\) with probability \(\tfrac{\exp (\mu _i)}{\sum _{j=1}^n\exp (\mu _j)}\). The Gumbel-Max theorem provides distributions for the identity of the noisy maximum and the value of the noisy maximum:

Theorem 9

(The Gumbel-Max Trick [25, 35]) Let \(G_1, \dots , G_n\) be i.i.d. \({{\,\mathrm{Gumbel}\,}}(0)\) random variables and let \(\mu _1, \dots , \mu _n\) be real numbers. Define \(X_i = G_i+\mu _i\). Then

  1. The distribution of \(\arg \max _{i} (X_1,\dots , X_n)\) is the same as \({{\,\mathrm{Cat}\,}}(\mu _1, \dots , \mu _n)\).

  2. The distribution of \(\max _{i} (X_1,\dots , X_n)\) is the same as the \({{\,\mathrm{Gumbel}\,}}(\ln \sum _{i=1}^n \exp (\mu _i))\) distribution.
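
The following NumPy sketch (ours, purely for illustration; the utilities in mu are arbitrary) checks part 1 of Theorem 9 empirically: the argmax of Gumbel-perturbed scores is distributed as \({{\,\mathrm{Cat}\,}}(\mu _1, \dots , \mu _n)\).

import numpy as np

rng = np.random.default_rng(1)
mu = np.array([0.3, 1.0, 2.5, -0.4])                   # illustrative utilities
n_trials = 200_000

# Gumbel-Max sampling: argmax_i (mu_i + Gumbel(0) noise).
noise = rng.gumbel(loc=0.0, scale=1.0, size=(n_trials, len(mu)))
winners = np.argmax(mu + noise, axis=1)
empirical = np.bincount(winners, minlength=len(mu)) / n_trials

# Target categorical distribution Cat(mu_1, ..., mu_n): softmax of mu.
target = np.exp(mu) / np.exp(mu).sum()
print(np.round(empirical, 3), np.round(target, 3))     # should be close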

Using the Gumbel-Max trick, one can propose an Exponential Mechanism with Gap by replacing the Laplace or exponential noise in Noisy Max with Gap with Gumbel noise, as shown in Algorithm 7. (Boxed items represent gap information.) We first prove the correctness of this algorithm and then show how to replace the Gumbel-Max trick with any efficient black-box algorithm for the exponential mechanism.

(Algorithm 7: Exponential Mechanism with Gap; pseudocode figure omitted.)

We first need the following results.

Lemma 9

Let \(\epsilon >0\). Let \(\mu :\mathcal {D}\times \mathcal {R}\rightarrow \mathbb {R}\) be a utility function of sensitivity \(\varDelta _{\mu }\). Define \(\nu :\mathcal {D}\rightarrow \mathbb {R}\) and its sensitivity \(\varDelta _{\nu }\) as

$$\begin{aligned} \nu (D) = \ln \sum _{\omega \in \mathcal {R}}e^{\frac{\epsilon \mu (D,\omega )}{2\varDelta _{\mu }}}, \quad \varDelta _{\nu } = \max _{D\sim D'}\left|\nu (D)-\nu (D')\right|. \end{aligned}$$

Then \(\varDelta _{\nu }\), the sensitivity of \(\nu \), is at most \( \frac{\epsilon }{2}\).

Proof of Lemma 9

From the definition of \(\nu \) we have

$$\begin{aligned} \left|\nu (D) - \nu (D')\right|&= \left|\ln \sum _{\omega \in \mathcal {R}}e^{\tfrac{\epsilon \mu (D,\omega )}{2\varDelta _{\mu }}} - \ln \sum _{\omega \in \mathcal {R}}e^{\tfrac{\epsilon \mu (D^\prime ,\omega )}{2\varDelta _{\mu }}}\right|\\&=\left|\ln \left( {\sum _{\omega \in \mathcal {R}}e^{\tfrac{\epsilon \mu (D,\omega )}{2\varDelta _{\mu }}}}\right) /\left( {\sum _{\omega \in \mathcal {R}}e^{\tfrac{\epsilon \mu (D',\omega )}{2\varDelta _{\mu }}}}\right) \right| \end{aligned}$$

By definition of sensitivity, we have

$$\mu (D',\omega )-\varDelta _{\mu }\le \mu (D,\omega )\le \mu (D',\omega )+\varDelta _{\mu }, \text { and therefore}$$
$$\begin{aligned} e^{-\frac{\epsilon }{2}} \sum _{\omega \in \mathcal {R}}e^{\frac{\epsilon \mu (D',\omega )}{2\varDelta _{\mu }}} \le \sum _{\omega \in \mathcal {R}}e^{\frac{\epsilon \mu (D,\omega )}{2\varDelta _{\mu }}} \le e^{\frac{\epsilon }{2}} \sum _{\omega \in \mathcal {R}}e^{\frac{\epsilon \mu (D',\omega )}{2\varDelta _{\mu }}} \end{aligned}$$

Thus \(\left|\nu (D) - \nu (D')\right| \le \frac{\epsilon }{2}\), and hence \(\varDelta _{\nu } \le \frac{\epsilon }{2}\). \(\square \)
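
A small numerical illustration of Lemma 9 (our own sketch; the utilities and the way the adjacent database is modelled are assumptions): perturbing every utility score by at most \(\varDelta _{\mu }\) changes \(\nu \) by at most \(\epsilon /2\).

import numpy as np

rng = np.random.default_rng(2)
eps, delta_mu = 1.0, 1.0                               # illustrative values
nu = lambda mu: np.log(np.sum(np.exp(eps * mu / (2 * delta_mu))))

mu_D = rng.normal(size=20)                             # utilities mu(D, w) for 20 candidates
# On an adjacent database each utility can change by at most delta_mu.
mu_Dp = mu_D + rng.uniform(-delta_mu, delta_mu, size=20)

assert abs(nu(mu_D) - nu(mu_Dp)) <= eps / 2 + 1e-12    # Lemma 9 bound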

Lemma 10

Let \(f(x;\theta )=\frac{e^{-(x-\theta )}}{(1+e^{-(x-\theta )})^2}\) be the density of the logistic distribution with location parameter \(\theta \). Then \(\left|\ln \frac{ f(x;\theta )}{f(x;\theta ')}\right| \le \left|\theta - \theta '\right|.\)

Proof of Lemma 10

Note that \(\left|\ln \frac{ f(x;\theta )}{f(x;\theta ')}\right|=\left|\ln \frac{ f(x;\theta ^\prime )}{f(x;\theta )}\right|\) so without loss of generality, we can assume that \(\theta \ge \theta ^\prime \) (i.e., the location parameter in the numerator is \(\ge \) the parameter in the denominator). From the formula of f we have \( \tfrac{f(x;\theta )}{f(x;\theta ')} =e^{\theta -\theta '} \cdot \left( \tfrac{1+e^{-x}e^{\theta '}}{1+e^{-x}e^{\theta }}\right) ^2 \). Clearly \(e^{\theta }\ge e^{\theta '}\implies \tfrac{1+e^{-x}e^{\theta '}}{1+e^{-x}e^{\theta }} \le 1\). Also,

$$\begin{aligned} \tfrac{1+e^{-x}e^{\theta '}}{1+e^{-x}e^{\theta }} = \tfrac{e^{\theta '-\theta }(e^{\theta -\theta '}+e^{-x}e^{\theta })}{1+e^{-x}e^{\theta }} \ge \tfrac{e^{\theta '-\theta }(1+e^{-x}e^{\theta })}{1+e^{-x}e^{\theta }} = e^{\theta '-\theta }. \end{aligned}$$

Therefore, \( e^{\theta '-\theta }=e^{\theta -\theta '} \cdot (e^{\theta '-\theta })^2\le \tfrac{f(x;\theta )}{f(x;\theta ')} \le e^{\theta -\theta '}.\) Thus \( \left|\ln \tfrac{f(x;\theta )}{f(x;\theta ')}\right| \le \left|\theta - \theta '\right|. \) \(\square \)

Theorem 10

Algorithm 7 satisfies \(\epsilon \)-differential privacy. Its output distribution is equivalent to selecting \(\omega _s\) with probability proportional to \(\exp \big (\frac{\epsilon \mu (D,\omega _s)}{2\varDelta _\mu }\big )\) and then independently sampling the gap from the logistic distribution (conditional on only sampling nonnegative values) with location parameter \(\theta = \frac{\epsilon \mu (D,\omega _s)}{2\varDelta _\mu } - \ln \sum \limits _{j\ne s}\exp (\frac{\epsilon \mu (D,\omega _j)}{2\varDelta _\mu })\).

Proof of Theorem 10

For \(\omega _i\in \mathcal {R}\), let \(\mu _i=\tfrac{\epsilon \mu (D,\omega _i)}{2\varDelta _{\mu }}\) and \(\mu '_i=\tfrac{\epsilon \mu (D',\omega _i)}{2\varDelta _{\mu }}\). Let \(X_i\sim {{\,\mathrm{Gumbel}\,}}(\mu _i)\) and \(X'_i\sim {{\,\mathrm{Gumbel}\,}}(\mu '_i)\).

We consider the probability of outputting the selected \(\omega _s\) with gap at least \(\gamma \ge 0\) when D is the input database:

$$\begin{aligned}&P(\omega _s \text { is chosen with gap} \ge \gamma ~|~ D)\\&\quad = \int _\mathbb {R}P(X_s = z+ \gamma ) \prod \limits _{i\ne s}P(X_i \le z)~{\mathrm{d}}z\\&\quad =\int _\mathbb {R}\exp (-(z+\gamma -\mu _s)-e^{-(z+\gamma -\mu _s)}) \prod \limits _{i\ne s}e^{-e^{-(z-\mu _i)}}~{\mathrm{d}}z\\&\quad =\int _\mathbb {R}e^{\mu _s-\gamma }\exp (-z-e^{\mu _s-\gamma }e^{-z}) \prod \limits _{i\ne s}\exp (-e^{\mu _i}e^{-z})~{\mathrm{d}}z\\ \end{aligned}$$

(let \(\mu ^* = \ln (\sum _{i\ne s}e^{\mu _i})\) and \(\theta = \mu _s - \mu ^*\))

$$\begin{aligned}&=\int _\mathbb {R}e^{\mu _s-\gamma }\exp (-z-e^{\mu _s-\gamma }e^{-z}) \exp (-e^{\mu ^*}e^{-z})~{\mathrm{d}}z\\&=\int _\mathbb {R}e^{\mu _s-\gamma }\exp (-z-(e^{\mu _s-\gamma }+e^{\mu ^*})e^{-z}) ~{\mathrm{d}}z\\&=\frac{e^{\mu _s-\gamma }}{e^{\mu _s-\gamma } + e^{\mu ^*}}\exp (-(e^{\mu _s-\gamma }+e^{\mu ^*})e^{-z})\Big |_{-\infty }^{+\infty }\\&=\frac{e^{\mu _s-\gamma }}{e^{\mu _s-\gamma } + e^{\mu ^*}} = \frac{1}{1+e^{-(\mu _s-\gamma -\mu ^*)}} = \frac{1}{1+e^{-(\theta -\gamma )}} \end{aligned}$$

and so

$$\begin{aligned}&P(\omega _s \text { is chosen with gap} \in [0,\gamma ] ~|~ D) \\&\quad = P(\omega _s \text { is chosen}~|~D) \\&\qquad - P(\omega _s \text { is chosen with gap} \ge \gamma ~|~ D)\\&\quad = \frac{e^{\mu _s}}{e^{\mu _s} + e^{\mu ^*}} - \frac{1}{1+e^{-(\mu _s-\gamma -\mu ^*)}} = \frac{1}{1+e^{-\theta }} \\&\qquad - \frac{1}{1+e^{-(\theta -\gamma )}}\\ \end{aligned}$$

Taking the derivative with respect to \(\gamma \), we get the density \(f(\omega _s, \gamma ~|~D)\) of \(\omega _s\) being chosen with gap equal to \(\gamma \):

$$\begin{aligned}&f(\omega _s,\gamma ~|~D) =\frac{\mathrm{d}}{\mathrm{d}\gamma }\left( \frac{1}{1+e^{-\theta }} - \frac{1}{1+e^{-(\theta -\gamma )}}\right) \nonumber \\&\quad = \frac{e^{-(\gamma -\theta )}}{(e^{-(\gamma - \theta )}+1)^2}\mathbf {1}_{[\gamma \ge 0]} \end{aligned}$$
(10)
$$\begin{aligned}&= \frac{e^{\mu _s}}{e^{\mu _s} + e^{\mu ^*}} \left( \frac{e^{-(\gamma -\theta )}}{(e^{-(\gamma - \theta )}+1)^2}\mathbf {1}_{[\gamma \ge 0]}\right) \Big /\frac{e^{\mu _s}}{e^{\mu _s} + e^{\mu ^*}}\nonumber \\&= \frac{e^{\mu _s}}{e^{\mu _s} + e^{\mu ^*}} \left( \frac{e^{-(\gamma -\theta )}}{(e^{-(\gamma - \theta )}+1)^2}\mathbf {1}_{[\gamma \ge 0]}\right) \Big /\frac{1}{1 + e^{-\theta }} \end{aligned}$$
(11)

Now, in Eq. 11, the term \(\frac{e^{\mu _s}}{e^{\mu _s} + e^{\mu ^*}}=\frac{e^{\mu _s}}{e^{\mu _s} + \sum _{i\ne s}e^{\mu _i}}=\frac{e^{\mu _s}}{\sum _i e^{\mu _i}}\) is the probability of selecting \(\omega _s\).

The term \(\frac{e^{-(\gamma -\theta )}}{(e^{-(\gamma - \theta )}+1)^2}\mathbf {1}_{[\gamma \ge 0]}\) is the density of the event that a logistic random variable with location \(\theta \) has value \(\gamma \) and is nonnegative.

Finally, the term \(\frac{1}{1 + e^{-\theta }}\) is the probability that a logistic random variable with location \(\theta \) is nonnegative.

Thus \(\left( \frac{e^{-(\gamma -\theta )}}{(e^{-(\gamma - \theta )}+1)^2}\mathbf {1}_{[\gamma \ge 0]}\right) \Big /\frac{1}{1 + e^{-\theta }}\) is the probability of a logistic random variable having value \(\gamma \) conditioned on it being nonnegative.

Therefore Eq. 11 is the probability of selecting \(\omega _s\) and independently sampling a nonnegative value \(\gamma \) from the logistic distribution with location parameter \(\theta = \mu _s-\mu ^*\), conditioned on it returning only nonnegative values.

Now, recalling that \(\mu _i=\frac{\epsilon \mu (D,\omega _i)}{2\varDelta _\mu }\), we apply Lemmas 10 and 9, together with Eq. 10, to finish the proof:

$$\begin{aligned}&\left|\ln \frac{f(\omega _s, \gamma ~|~D)}{f(\omega _s, \gamma ~|~D^\prime )}\right| \le \left|(\mu _s - \mu ^*) - (\mu _{s}^{\prime } - \mu ^{*\prime })\right|\\&\quad \le |\mu _s -\mu _s^\prime | + |\ln \sum _{i\ne s}e^{\mu _i} - \ln \sum _{i\ne s}e^{\mu ^\prime _i}| \\&\quad \le \epsilon /2 + \epsilon /2 = \epsilon . \end{aligned}$$

\(\square \)
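
The mechanism analyzed above can be prototyped directly from the Gumbel-Max view. The sketch below is ours; the function name and parameters are illustrative, and the pseudocode of Algorithm 7 in the paper remains authoritative. It adds Gumbel noise to the scaled utilities, releases the index of the largest noisy score, and additionally releases the gap between the two largest noisy scores.

import numpy as np

def exp_mech_with_gap(utilities, eps, delta_mu, rng):
    # Gumbel-noise implementation of exponential-mechanism selection that
    # also returns the gap between the two largest noisy scores (the boxed
    # extra output of Algorithm 7).  `utilities` holds mu(D, w_i).
    scores = eps * np.asarray(utilities, dtype=float) / (2 * delta_mu)
    noisy = scores + rng.gumbel(size=len(scores))
    order = np.argsort(noisy)
    selected = order[-1]                               # noisy argmax
    gap = noisy[order[-1]] - noisy[order[-2]]          # free gap information
    return selected, gap

rng = np.random.default_rng(3)
selected, gap = exp_mech_with_gap([10.0, 8.0, 8.0, 3.0], eps=1.0, delta_mu=1.0, rng=rng)

By Theorem 10, over many runs the selected index follows \({{\,\mathrm{Cat}\,}}\big (\frac{\epsilon \mu (D,\omega _1)}{2\varDelta _\mu },\dots \big )\) and, conditioned on the selection, the released gap follows the logistic distribution with location \(\theta \) truncated to nonnegative values.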

Proof of Theorem 6

The first part follows directly from Theorem 10. Also, from the proof of Theorem 10 the gap \(g_s\) has density \(f(x;\theta ) = \left( \frac{e^{-(x-\theta )}}{(e^{-(x - \theta )}+1)^2}\mathbf {1}_{[x\ge 0]}\right) \Big /\frac{1}{1 + e^{-\theta }}\). Since

$$\begin{aligned}&\int _0^t\frac{e^{-x+\theta }}{(e^{-x+\theta }+1)^2}\cdot x\,{\mathrm{d}}x =\int _0^t\frac{e^{x-\theta }}{(1+e^{x-\theta })^2}\cdot x\,{\mathrm{d}}x\\&\quad =\int _0^t x\cdot \left( \frac{-1}{1+e^{x-\theta }}\right) '\,{\mathrm{d}}x = \frac{-x}{1+e^{x-\theta }}\Big |_0^t \\&\qquad + \int _0^t \frac{1}{1+e^{x-\theta }}\, {\mathrm{d}}x\\&\quad = \frac{-t}{1+e^{t-\theta }} + (x-\ln (1+e^{x-\theta }))\Big |_0^t\\&\quad =\frac{-t}{1+e^{t-\theta }} + t - \ln (1+e^{t-\theta }) + \ln (1+e^{-\theta })\\&\quad =\frac{-t}{1+e^{t-\theta }} +\ln \frac{e^t}{1+e^{t-\theta }} + \ln (1+e^{-\theta })\\ \end{aligned}$$

We have

$$\begin{aligned}&\int _0^\infty \frac{e^{-x+\theta }}{(e^{-x+\theta }+1)^2}\cdot x\,\mathrm{d}x=\lim _{t\rightarrow \infty } \int _0^t\frac{e^{-x+\theta }}{(e^{-x+\theta }+1)^2}\cdot x\,{\mathrm{d}}x\\&\quad =\lim _{t\rightarrow \infty } \left( \frac{-t}{1+e^{t-\theta }} +\ln \frac{e^t}{1+e^{t-\theta }} + \ln (1+e^{-\theta })\right) \\&\quad =0 + \ln (e^\theta ) + \ln (1+e^{-\theta }) = \ln (1+e^\theta ) \end{aligned}$$

Hence \(\mathbb {E}(g_s) = (1+e^{-\theta })\ln (1+e^\theta )\). \(\square \)
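
A quick Monte Carlo check of this expectation (our own sketch; \(\theta =-0.7\) is an arbitrary illustrative value): sample logistic variables with location \(\theta \), keep the nonnegative ones, and compare their mean with \((1+e^{-\theta })\ln (1+e^\theta )\).

import numpy as np

rng = np.random.default_rng(4)
theta = -0.7                                           # illustrative location parameter
x = rng.logistic(loc=theta, scale=1.0, size=2_000_000)
empirical = x[x >= 0].mean()                           # mean of the truncated logistic
closed_form = (1 + np.exp(-theta)) * np.log(1 + np.exp(theta))
print(empirical, closed_form)                          # should agree closely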

Proof of Theorem 7

Assume \(H_0\) is true, i.e., there exists a \(t\ne s\) such that \(\mu (D,\omega _s) < \mu (D,\omega _{t})\). Then

$$\begin{aligned} \theta= & {} \tfrac{\epsilon \mu (D, \omega _s)}{2\varDelta _{\mu }} - \ln \sum \limits _{j\ne s}\exp ({\tfrac{\epsilon \mu (D, \omega _j)}{2\varDelta _{\mu }}})\\\le & {} \tfrac{\epsilon \mu (D,\omega _s)}{2\varDelta _\mu } - \ln \exp \tfrac{\epsilon \mu (D,\omega _{t})}{2\varDelta _\mu } = \tfrac{\epsilon \mu (D,\omega _s)}{2\varDelta _\mu } - \tfrac{\epsilon \mu (D,\omega _{t})}{2\varDelta _\mu } < 0 \end{aligned}$$

Using the density of the gap derived above, we have

$$\begin{aligned}&\mathbb {P}[x \ge \gamma \mid H_0] = (1+e^{-\theta })\int _\gamma ^\infty \frac{e^{-x+\theta }}{(1+e^{-x+\theta })^2}\,{\mathrm{d}}x \\&\quad =(1+e^{-\theta })\int _\gamma ^\infty \frac{e^{x-\theta }}{(1+e^{x-\theta })^2}\,{\mathrm{d}}x \\&\quad =(1+e^{-\theta })\cdot \left( \frac{-1}{1+e^{x-\theta }}\Big |_\gamma ^\infty \right) \\&\quad = \frac{1+e^{-\theta }}{1+e^{\gamma -\theta }} = \frac{e^\theta +1}{e^\theta +e^\gamma } < \frac{2}{1+e^\gamma } \end{aligned}$$

because \(\frac{e^\theta +1}{e^\theta +e^\gamma }\) is an increasing function of \(\theta \) and \(\theta < 0\). \(\square \)
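
As a worked example (our own numbers): a released gap of \(\gamma = 5\) yields \(\mathbb {P}[x \ge 5 \mid H_0] < \frac{2}{1+e^5} \approx 0.013\), so \(H_0\) can be rejected at roughly the 1.3% level without spending any additional privacy budget.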

Cite this article

Ding, Z., Wang, Y., Xiao, Y. et al. Free gap estimates from the exponential mechanism, sparse vector, noisy max and related algorithms. The VLDB Journal 32, 23–48 (2023). https://doi.org/10.1007/s00778-022-00728-2
