
On-Average KL-Privacy and Its Equivalence to Generalization for Max-Entropy Mechanisms

  • Conference paper
  • In: Privacy in Statistical Databases (PSD 2016)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9867)

Abstract

We define On-Average KL-Privacy and present its properties and connections to differential privacy, generalization, and information-theoretic quantities including max-information and mutual information. The new definition significantly weakens differential privacy while preserving its minimal design features, such as composition over small groups and multiple queries, as well as closedness under post-processing. Moreover, we show that On-Average KL-Privacy is equivalent to generalization for a large class of commonly used tools in statistics and machine learning that sample from Gibbs distributions, a class of distributions that arises naturally from the maximum entropy principle. In addition, a byproduct of our analysis yields a lower bound for generalization error in terms of mutual information, which reveals an interesting interplay with known upper bounds that use the same quantity.


Notes

  1. We will formally define these quantities.

  2. These assumptions are only for simplicity of presentation. The notion of On-Average KL-Privacy can naturally handle mixtures of densities and point masses.

References

  1. Akaike, H.: Likelihood of a model and information criteria. J. Econometrics 16(1), 3–14 (1981)

  2. Altun, Y., Smola, A.J.: Unifying divergence minimization and statistical inference via convex duality. In: Lugosi, G., Simon, H.U. (eds.) COLT 2006. LNCS (LNAI), vol. 4005, pp. 139–153. Springer, Heidelberg (2006)

  3. Anderson, N.: “Anonymized” data really isn’t and here’s why not (2009). http://arstechnica.com/tech-policy/2009/09/your-secrets-live-online-in-databases-of-ruin/

  4. Barber, R.F., Duchi, J.C.: Privacy and statistical risk: formalisms and minimax bounds. arXiv preprint arXiv:1412.4451 (2014)

  5. Bassily, R., Nissim, K., Smith, A., Steinke, T., Stemmer, U., Ullman, J.: Algorithmic stability for adaptive data analysis. arXiv preprint arXiv:1511.02513 (2015)

  6. Berger, A.L., Pietra, V.J.D., Pietra, S.A.D.: A maximum entropy approach to natural language processing. Comput. Linguist. 22(1), 39–71 (1996)

  7. Bousquet, O., Elisseeff, A.: Stability and generalization. J. Mach. Learn. Res. 2, 499–526 (2002)

  8. Duncan, G.T., Elliot, M., Salazar-González, J.J.: Statistical Confidentiality: Principles and Practice. Springer, New York (2011)

  9. Duncan, G.T., Fienberg, S.E., Krishnan, R., Padman, R., Roehrig, S.F.: Disclosure limitation methods and information loss for tabular data. In: Confidentiality, Disclosure and Data Access: Theory and Practical Applications for Statistical Agencies, pp. 135–166 (2001)

  10. Dwork, C.: Differential privacy. In: Bugliesi, M., Preneel, B., Sassone, V., Wegener, I. (eds.) ICALP 2006. LNCS, vol. 4052, pp. 1–12. Springer, Heidelberg (2006)

  11. Dwork, C., Feldman, V., Hardt, M., Pitassi, T., Reingold, O., Roth, A.: Generalization in adaptive data analysis and holdout reuse. In: Advances in Neural Information Processing Systems (NIPS 2015), pp. 2341–2349 (2015)

  12. Dwork, C., Feldman, V., Hardt, M., Pitassi, T., Reingold, O., Roth, A.: Preserving statistical validity in adaptive data analysis. In: ACM Symposium on Theory of Computing (STOC 2015), pp. 117–126. ACM (2015)

  13. Dwork, C., Kenthapadi, K., McSherry, F., Mironov, I., Naor, M.: Our data, ourselves: privacy via distributed noise generation. In: Vaudenay, S. (ed.) EUROCRYPT 2006. LNCS, vol. 4004, pp. 486–503. Springer, Heidelberg (2006)

  14. Dwork, C., McSherry, F., Nissim, K., Smith, A.: Calibrating noise to sensitivity in private data analysis. In: Halevi, S., Rabin, T. (eds.) TCC 2006. LNCS, vol. 3876, pp. 265–284. Springer, Heidelberg (2006)

  15. Ebadi, H., Sands, D., Schneider, G.: Differential privacy: now it’s getting personal. In: ACM Symposium on Principles of Programming Languages, pp. 69–81. ACM (2015)

  16. Fienberg, S.E., Rinaldo, A., Yang, X.: Differential privacy and the risk-utility tradeoff for multi-dimensional contingency tables. In: Domingo-Ferrer, J., Magkos, E. (eds.) PSD 2010. LNCS, vol. 6344, pp. 187–199. Springer, Heidelberg (2010)

  17. Hall, R., Rinaldo, A., Wasserman, L.: Random differential privacy. arXiv preprint arXiv:1112.2680 (2011)

  18. Hardt, M., Ullman, J.: Preventing false discovery in interactive data analysis is hard. In: IEEE Symposium on Foundations of Computer Science (FOCS 2014), pp. 454–463. IEEE (2014)

  19. Hundepool, A., Domingo-Ferrer, J., Franconi, L., Giessing, S., Nordholt, E.S., Spicer, K., De Wolf, P.P.: Statistical Disclosure Control. Wiley (2012)

  20. Jaynes, E.T.: Information theory and statistical mechanics. Phys. Rev. 106(4), 620 (1957)

  21. Kearns, M., Ron, D.: Algorithmic stability and sanity-check bounds for leave-one-out cross-validation. Neural Comput. 11(6), 1427–1453 (1999)

  22. Liu, Z., Wang, Y.X., Smola, A.: Fast differentially private matrix factorization. In: ACM Conference on Recommender Systems (RecSys 2015), pp. 171–178. ACM (2015)

  23. McSherry, F., Talwar, K.: Mechanism design via differential privacy. In: IEEE Symposium on Foundations of Computer Science (FOCS 2007), pp. 94–103 (2007)

  24. Mir, D.J.: Information-theoretic foundations of differential privacy. In: Garcia-Alfaro, J., Cuppens, F., Cuppens-Boulahia, N., Miri, A., Tawbi, N. (eds.) FPS 2012. LNCS, vol. 7743, pp. 374–381. Springer, Heidelberg (2013)

  25. Mosteller, F., Tukey, J.W.: Data analysis, including statistics (1968)

  26. Mukherjee, S., Niyogi, P., Poggio, T., Rifkin, R.: Learning theory: stability is sufficient for generalization and necessary and sufficient for consistency of empirical risk minimization. Adv. Comput. Math. 25(1–3), 161–193 (2006)

  27. Narayanan, A., Shmatikov, V.: Robust de-anonymization of large sparse datasets. In: IEEE Symposium on Security and Privacy, pp. 111–125. IEEE (2008)

  28. Russo, D., Zou, J.: Controlling bias in adaptive data analysis using information theory. In: International Conference on Artificial Intelligence and Statistics (AISTATS 2016) (2016)

  29. Shalev-Shwartz, S., Shamir, O., Srebro, N., Sridharan, K.: Learnability, stability and uniform convergence. J. Mach. Learn. Res. 11, 2635–2670 (2010)

  30. Steinke, T., Ullman, J.: Interactive fingerprinting codes and the hardness of preventing false discovery. arXiv preprint arXiv:1410.1228 (2014)

  31. Tishby, N., Pereira, F.C., Bialek, W.: The information bottleneck method. arXiv preprint arXiv:physics/0004057 (2000)

  32. Uhler, C., Slavković, A., Fienberg, S.E.: Privacy-preserving data sharing for genome-wide association studies. J. Priv. Confidentiality 5(1), 137 (2013)

  33. Van Erven, T., Harremoës, P.: Rényi divergence and Kullback-Leibler divergence. IEEE Trans. Inf. Theor. 60(7), 3797–3820 (2014)

  34. Wang, Y.X., Fienberg, S.E., Smola, A.: Privacy for free: posterior sampling and stochastic gradient Monte Carlo. In: International Conference on Machine Learning (ICML 2015) (2015)

  35. Wang, Y.X., Lei, J., Fienberg, S.E.: Learning with differential privacy: stability, learnability and the sufficiency and necessity of the ERM principle. J. Mach. Learn. Res. (to appear, 2016)

  36. Wang, Y.X., Lei, J., Fienberg, S.E.: On-Average KL-Privacy and its equivalence to generalization for max-entropy mechanisms (2016). Preprint: http://www.cs.cmu.edu/~yuxiangw/publications.html

  37. Yau, N.: Lessons from improperly anonymized taxi logs (2014). http://flowingdata.com/2014/06/23/lessons-from-improperly-anonymized-taxi-logs/

  38. Yu, F., Fienberg, S.E., Slavković, A.B., Uhler, C.: Scalable privacy-preserving data sharing methodology for genome-wide association studies. J. Biomed. Inform. 50, 133–141 (2014)

  39. Zhou, S., Lafferty, J., Wasserman, L.: Compressed and privacy-sensitive sparse regression. IEEE Trans. Inf. Theor. 55(2), 846–866 (2009)


Author information

Correspondence to Yu-Xiang Wang.

A Proofs

We now prove Theorem 4 and Lemma 14. Due to space limits, the proofs of Lemmas 6, 7 and 11 are given in the technical report [36].

Proof of Theorem 4. We start with a ghost-sample argument.
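The key step in the display that follows rewrites losses as log-densities. For readability, recall the Gibbs form of the mechanism assumed in the theorem; writing the partition function as \(K(Z)\) (corresponding to the \(K_i, K_i'\) used afterwards, with any temperature absorbed into \(\ell \) and \(r\)), the mechanism samples \(h\) from

$$ p_{\mathcal {A}(Z)}(h) = \frac{1}{K(Z)}\exp \Big ( -\sum _{i=1}^n \ell (z_i,h) - r(h)\Big ), \qquad K(Z) = \int \exp \Big ( -\sum _{i=1}^n \ell (z_i,h) - r(h)\Big )\,dh, $$

so that \(\sum _{i=1}^n \ell (z_i,h) + r(h) = -\log p_{\mathcal {A}(Z)}(h) - \log K(Z)\), which is exactly the substitution made in the fourth equality.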

$$\begin{aligned}&\ \mathbb {E}_{z\sim \mathcal {D}, Z\sim \mathcal {D}^n} \mathbb {E}_{h\sim \mathcal {A}(Z)}\left[ \ell (z,h) - \frac{1}{n}\sum _{i=1}^n \ell (z_i,h)\right] \\ =&\ \mathbb {E}_{Z'\sim \mathcal {D}^n, Z\sim \mathcal {D}^n} \mathbb {E}_{h\sim \mathcal {A}(Z)}\left[ \frac{1}{n}\sum _{i=1}^n\ell (z_i',h) - \frac{1}{n}\sum _{i=1}^n \ell (z_i,h)\right] \\ =&\ \frac{1}{n}\sum _{i=1}^n \mathbb {E}_{z_i'\sim \mathcal {D}, Z\sim \mathcal {D}^n} \mathbb {E}_{h\sim \mathcal {A}(Z)}\left[ \ell (z_i',h) - \ell (z_i,h)\right] \\ =&\ \frac{1}{n}\sum _{i=1}^n \mathbb {E}_{z_i'\sim \mathcal {D}, Z\sim \mathcal {D}^n} \mathbb {E}_{h\sim \mathcal {A}(Z)}\left[ \ell (z_i',h) + \sum _{j\ne i}\ell (z_j,h) +r(h) - \ell (z_i,h) - \sum _{j\ne i}\ell (z_j,h) -r(h)\right] \\ =&\ \frac{1}{n}\sum _{i=1}^n \mathbb {E}_{z_i'\sim \mathcal {D}, Z\sim \mathcal {D}^n} \mathbb {E}_{h\sim \mathcal {A}(Z)}\left[ - \log p_{\mathcal {A}([Z_{-i},z'_i])}(h)+\log p_{\mathcal {A}(Z)}(h) + \log K_{i}- \log K'_{i}\right] \\ =&\ \frac{1}{n}\sum _{i=1}^n \mathbb {E}_{z_i'\sim \mathcal {D}, Z\sim \mathcal {D}^n} \mathbb {E}_{h\sim \mathcal {A}(Z)}\left[ \log p_{\mathcal {A}(Z)}(h) - \log p_{\mathcal {A}([Z_{-i},z'_i])}(h) \right] \\ =&\ \mathbb {E}_{z\sim \mathcal {D}, Z\sim \mathcal {D}^n} \mathbb {E}_{h\sim \mathcal {A}(Z)}\left[ \log p_{\mathcal {A}(Z)}(h) - \log p_{ \mathcal {A}([Z_{-1},z])}(h) \right] . \end{aligned}$$

Here \(K_i\) and \(K_i'\) are the partition functions of \(p_{\mathcal {A}(Z)}(h)\) and \(p_{\mathcal {A}([Z_{-i},z_i'])}(h)\), respectively. Since \(z_i\) and \(z_i'\) are identically distributed, so are \(K_i\) and \(K_i'\), hence \(\mathbb {E}\log K_i - \mathbb {E}\log K_i' = 0\). The proof is complete by noting the non-negativity of On-Average KL-Privacy, which allows us to take absolute values without changing the equivalence.    \(\square \)
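The identity just proved can be checked numerically by exact enumeration on a toy Gibbs mechanism. The setup below is entirely made up for illustration (binary data, binary hypothesis, a small loss table, \(n = 2\), \(r \equiv 0\)); it is not from the paper.

```python
import itertools
import math

q = 0.3                                                       # P(z = 1), hypothetical
loss = {(0, 0): 0.2, (0, 1): 1.1, (1, 0): 0.9, (1, 1): 0.4}   # ell(z, h), hypothetical

def gibbs(Z):
    """Gibbs mechanism density p(h | Z) proportional to exp(-sum_i ell(z_i, h))."""
    w = [math.exp(-sum(loss[(z, h)] for z in Z)) for h in (0, 1)]
    K = sum(w)  # partition function
    return [wi / K for wi in w]

def pz(z):
    return q if z == 1 else 1 - q

gap = 0.0  # expected generalization gap (left-hand side of the display)
kl = 0.0   # expected KL between A(Z) and A([Z_{-1}, z]) (right-hand side)
for z1, z2, z in itertools.product((0, 1), repeat=3):
    w = pz(z1) * pz(z2) * pz(z)          # joint probability of (z1, z2, fresh z)
    p = gibbs((z1, z2))                  # output density on dataset Z = (z1, z2)
    p_swap = gibbs((z, z2))              # first data point replaced by fresh z
    for h in (0, 1):
        gap += w * p[h] * (loss[(z, h)] - 0.5 * (loss[(z1, h)] + loss[(z2, h)]))
        kl += w * p[h] * (math.log(p[h]) - math.log(p_swap[h]))

print(gap, kl)  # the two quantities agree, as Theorem 4 asserts
```

Note that the equality holds only after averaging over the data: for a fixed dataset the \(\log K\) terms do not cancel, which is exactly why the expectation over \(z_i \sim z_i'\) is needed in the proof.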

Proof of Lemma 14. Write the density of \(\mathcal {A}(Z)\) as \(p(\mathcal {A}(Z)) = p(h|Z)\), so that the joint density is \(p(h,Z) = p(h|Z) p(Z)\) and the marginal distribution of \(h\) is \(p(h)=\int _Z p(h,Z) dZ = \mathbb {E}_{Z} p(\mathcal {A}(Z))\). By definition,

$$\begin{aligned}&\ I(\mathcal {A}(Z);Z) = \mathbb {E}_{Z}\mathbb {E}_{h|Z} \log \frac{p(h|Z) p(Z)}{p(h)p(Z)} \\ =&\ \mathbb {E}_{Z}\mathbb {E}_{h|Z} \log p(h|Z) - \mathbb {E}_{Z}\mathbb {E}_{h|Z}\log \mathbb {E}_{Z'} p(h|Z')\\ =&\ \mathbb {E}_{Z}\mathbb {E}_{h|Z} \log p(h|Z) - \mathbb {E}_{Z,Z'}\mathbb {E}_{h|Z} \log p(h|Z')\\&+\mathbb {E}_{Z,Z'}\mathbb {E}_{h|Z} \log p(h|Z')- \mathbb {E}_{Z}\mathbb {E}_{h|Z}\log \mathbb {E}_{Z'} p(h|Z')\\ =&\ \mathbb {E}_{Z,Z'}\mathbb {E}_{h|Z} \log \frac{p(h|Z)}{p(h|Z')}+\mathbb {E}_{Z,Z'}\mathbb {E}_{h|Z} \log p(h|Z')- \mathbb {E}_{Z}\mathbb {E}_{h|Z}\log \mathbb {E}_{Z'} p(h|Z')\\ =&\ D_{\mathrm {KL}}(\mathcal {A}(Z),\mathcal {A}(Z')) + \mathbb {E}_{Z}\mathbb {E}_{h|Z}\left[ \mathbb {E}_{Z'}\log p(\mathcal {A}(Z')) - \log \mathbb {E}_{Z'}p(\mathcal {A}(Z')) \right] \end{aligned}$$

The term in brackets on the last line is non-positive by Jensen's inequality (\(\log \) is concave, so \(\mathbb {E}\log p \le \log \mathbb {E} p\)), which yields the inequality claimed in the lemma.    \(\square \)
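The decomposition in this proof can likewise be verified by exact enumeration. The toy setup below (binary data, \(n = 2\), a hypothetical loss table, Gibbs mechanism) is ours, not the paper's; it checks that the mutual information splits exactly into the expected-KL term plus a non-positive Jensen gap.

```python
import itertools
import math

q = 0.3                                                       # P(z = 1), hypothetical
loss = {(0, 0): 0.2, (0, 1): 1.1, (1, 0): 0.9, (1, 1): 0.4}   # ell(z, h), hypothetical

def gibbs(Z):
    """Gibbs mechanism density p(h | Z) proportional to exp(-sum_i ell(z_i, h))."""
    w = [math.exp(-sum(loss[(z, h)] for z in Z)) for h in (0, 1)]
    K = sum(w)
    return [wi / K for wi in w]

def pZ(Z):
    return math.prod((q if z == 1 else 1 - q) for z in Z)

datasets = list(itertools.product((0, 1), repeat=2))
# Marginal p(h) = E_Z p(h | Z).
p_h = [sum(pZ(Z) * gibbs(Z)[h] for Z in datasets) for h in (0, 1)]

# Mutual information I(A(Z); Z), computed directly from its definition.
mi = sum(pZ(Z) * gibbs(Z)[h] * math.log(gibbs(Z)[h] / p_h[h])
         for Z in datasets for h in (0, 1))

# First term: E_{Z,Z'} E_{h|Z} log(p(h|Z) / p(h|Z')).
kl_term = sum(pZ(Z) * pZ(Zp) * gibbs(Z)[h] * math.log(gibbs(Z)[h] / gibbs(Zp)[h])
              for Z in datasets for Zp in datasets for h in (0, 1))

# Second term: E_Z E_{h|Z} [ E_{Z'} log p(h|Z') - log E_{Z'} p(h|Z') ]  (<= 0 by Jensen).
jensen_gap = sum(pZ(Z) * gibbs(Z)[h] *
                 (sum(pZ(Zp) * math.log(gibbs(Zp)[h]) for Zp in datasets)
                  - math.log(p_h[h]))
                 for Z in datasets for h in (0, 1))

print(mi, kl_term, jensen_gap)  # mi == kl_term + jensen_gap, jensen_gap <= 0
```

Since the Jensen gap is non-positive, the mutual information never exceeds the expected-KL term, matching the inequality concluded from the decomposition.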


Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Wang, Y.X., Lei, J., Fienberg, S.E. (2016). On-Average KL-Privacy and Its Equivalence to Generalization for Max-Entropy Mechanisms. In: Domingo-Ferrer, J., Pejić-Bach, M. (eds) Privacy in Statistical Databases. PSD 2016. Lecture Notes in Computer Science, vol 9867. Springer, Cham. https://doi.org/10.1007/978-3-319-45381-1_10


  • DOI: https://doi.org/10.1007/978-3-319-45381-1_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-45380-4

  • Online ISBN: 978-3-319-45381-1

  • eBook Packages: Computer Science, Computer Science (R0)
