Bounds on the Sample Complexity for Private Learning and Private Data Release

Beimel, Amos; Kasiviswanathan, Shiva Prasad; Nissim, Kobbi

doi:10.1007/978-3-642-11799-2_26

Amos Beimel, Shiva Prasad Kasiviswanathan & Kobbi Nissim

Conference paper


Part of the book series: Lecture Notes in Computer Science (LNCS, volume 5978)

Abstract

Learning is a task that generalizes many of the analyses that are applied to collections of data, and in particular, collections of sensitive individual information. Hence, it is natural to ask what can be learned while preserving individual privacy. [Kasiviswanathan, Lee, Nissim, Raskhodnikova, and Smith; FOCS 2008] initiated such a discussion. They formalized the notion of private learning, as a combination of PAC learning and differential privacy, and investigated what concept classes can be learned privately. Somewhat surprisingly, they showed that, ignoring time complexity, every PAC learning task could be performed privately with polynomially many samples, and in many natural cases this could even be done in polynomial time.
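To make the "every PAC learning task could be performed privately" claim concrete, here is a minimal Python sketch of a generic private learner for a finite hypothesis class, built on the exponential mechanism of McSherry and Talwar. It is an illustration under stated assumptions, not code from the paper: the function names (`empirical_errors`, `selection_weights`, `private_learner`) and the threshold-function demo are hypothetical, and neighboring samples are assumed to differ in one labeled example, so each error count has sensitivity 1.

```python
import math
import random


def empirical_errors(hypotheses, sample):
    """Count, for each hypothesis, how many labeled examples it misclassifies."""
    return [sum(1 for x, y in sample if h(x) != y) for h in hypotheses]


def selection_weights(errors, epsilon):
    """Unnormalized exponential-mechanism weights with score = -errors.

    Changing one example changes each error count by at most 1, so weights
    proportional to exp(-epsilon * errors / 2) give epsilon-differential
    privacy. Subtracting the minimum error only rescales all weights by a
    common factor (for numerical stability) and leaves the distribution
    unchanged.
    """
    best = min(errors)
    return [math.exp(-epsilon * (e - best) / 2.0) for e in errors]


def private_learner(hypotheses, sample, epsilon, rng):
    """Return the index of a hypothesis sampled by the exponential mechanism."""
    weights = selection_weights(empirical_errors(hypotheses, sample), epsilon)
    return rng.choices(range(len(hypotheses)), weights=weights, k=1)[0]


if __name__ == "__main__":
    # Hypothetical demo: threshold functions over {0, ..., 15}, target t = 7.
    rng = random.Random(0)
    thresholds = list(range(17))
    hypotheses = [lambda x, t=t: int(x >= t) for t in thresholds]
    points = [rng.randrange(16) for _ in range(400)]
    sample = [(x, int(x >= 7)) for x in points]
    chosen = private_learner(hypotheses, sample, epsilon=1.0, rng=rng)
    print("chosen threshold:", thresholds[chosen])
```

Hypotheses with many empirical errors receive exponentially small weight, so once the sample is large relative to the logarithm of the number of hypotheses, the sampled hypothesis is very likely to have low error. That is the intuition behind a sample complexity that scales with the logarithm of the class size rather than with the VC-dimension.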

While these results seem to equate non-private and private learning, there is still a significant gap: the sample complexity of (non-private) PAC learning is crisply characterized in terms of the VC-dimension of the concept class, whereas this relationship is lost in the constructions of private learners, which exhibit, generally, a higher sample complexity.
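The gap can be written out as follows. This is a sketch using standard bounds from the cited literature rather than statements taken from the abstract. The symbols are assumptions introduced here: α is the accuracy parameter, β the confidence parameter, ε the privacy parameter, and C a finite concept class. The private bound is given only up to its dependence on α, β, and ε.

```latex
% Non-private PAC learning: sample complexity governed by the VC-dimension
% (upper bound: Blumer et al.; lower bound: Ehrenfeucht et al.).
\[
  m_{\mathrm{PAC}}(C) = O\!\left(\frac{\mathrm{VC}(C)\,\ln(1/\alpha) + \ln(1/\beta)}{\alpha}\right),
  \qquad
  m_{\mathrm{PAC}}(C) = \Omega\!\left(\frac{\mathrm{VC}(C) + \ln(1/\beta)}{\alpha}\right).
\]
% Generic private learner (Kasiviswanathan et al.): logarithmic in |C|,
% with the dependence on alpha, beta, epsilon suppressed.
\[
  m_{\mathrm{private}}(C) = O_{\alpha,\beta,\epsilon}\bigl(\ln |C|\bigr).
\]
% The two measures can differ, since for every finite class
\[
  \mathrm{VC}(C) \le \log_2 |C| .
\]
```

The last inequality holds because shattering a set of size d requires at least 2^d distinct concepts. It can be far from tight, which is where the generic private construction may need many more samples than a non-private learner.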

Looking into this gap, we examine several private learning tasks and give tight bounds on their sample complexity. In particular, we show strong separations between the sample complexities of proper and improper private learners (no such separation exists for non-private learners), and between the sample complexities of efficient and inefficient proper private learners. Our results show that VC-dimension is not the right measure for characterizing the sample complexity of proper private learning.

We also examine the task of private data release (as initiated by [Blum, Ligett, and Roth; STOC 2008]), and give new lower bounds on its sample complexity. Our results show that the logarithmic dependence on the size of the instance space is essential for private data release.

The original version of this chapter was revised: The copyright line was incorrect. This has been corrected. The Erratum to this chapter is available at DOI: 10.1007/978-3-642-11799-2_36

References

Beimel, A., Kasiviswanathan, S., Nissim, K.: Bounds on the Sample Complexity for Private Learning and Private Data Release (Full version) (2009)
Blum, A., Dwork, C., McSherry, F., Nissim, K.: Practical privacy: The SuLQ framework. In: PODS, pp. 128–138. ACM, New York (2005)
Blum, A., Ligett, K., Roth, A.: A learning theory approach to non-interactive database privacy. In: STOC, pp. 609–618. ACM, New York (2008)
Blum, A., Ligett, K., Roth, A.: Private communication (2008)
Blumer, A., Ehrenfeucht, A., Haussler, D., Warmuth, M.K.: Learnability and the Vapnik-Chervonenkis dimension. Journal of the Association for Computing Machinery 36(4), 929–965 (1989)
Dwork, C.: The differential privacy frontier (extended abstract). In: Reingold, O. (ed.) TCC 2009. LNCS, vol. 5444, pp. 496–502. Springer, Heidelberg (2009)
Dwork, C., McSherry, F., Nissim, K., Smith, A.: Calibrating noise to sensitivity in private data analysis. In: Halevi, S., Rabin, T. (eds.) TCC 2006. LNCS, vol. 3876, pp. 265–284. Springer, Heidelberg (2006)
Dwork, C., Naor, M., Reingold, O., Rothblum, G., Vadhan, S.: On the complexity of differentially private data release. In: STOC, pp. 381–390. ACM, New York (2009)
Ehrenfeucht, A., Haussler, D., Kearns, M.J., Valiant, L.G.: A general lower bound on the number of examples needed for learning. Inf. Comput. 82(3), 247–261 (1989)
Kasiviswanathan, S.P., Lee, H.K., Nissim, K., Raskhodnikova, S., Smith, A.: What can we learn privately? In: FOCS, pp. 531–540. IEEE Computer Society, Los Alamitos (2008)
Kasiviswanathan, S.P., Smith, A.: A note on differential privacy: Defining resistance to arbitrary side information. CoRR, arXiv:0803.3946 [cs.CR] (2008)
Kearns, M.J.: Efficient noise-tolerant learning from statistical queries. Journal of the ACM 45(6), 983–1006 (1998); Preliminary version in Proceedings of STOC 1993
Kearns, M.J., Vazirani, U.V.: An Introduction to Computational Learning Theory. MIT Press, Cambridge (1994)
McSherry, F., Talwar, K.: Mechanism design via differential privacy. In: FOCS, pp. 94–103. IEEE, Los Alamitos (2007)
Mishra, N., Sandler, M.: Privacy via pseudorandom sketches. In: PODS, pp. 143–152. ACM, New York (2006)
Pitt, L., Valiant, L.G.: Computational limitations on learning from examples. Journal of the ACM 35(4), 965–984 (1988)
Valiant, L.G.: A theory of the learnable. Communications of the ACM 27, 1134–1142 (1984)
Vapnik, V.N., Chervonenkis, A.Y.: On the uniform convergence of relative frequencies of events to their probabilities. Theory of Probability and its Applications 16, 264 (1971)

Author information

Authors and Affiliations

Dept. of Computer Science, Ben-Gurion University, Israel
Amos Beimel & Kobbi Nissim
CCS-3, Los Alamos National Laboratory, USA
Shiva Prasad Kasiviswanathan
Microsoft Audience Intelligence, Israel
Kobbi Nissim

Editor information

Editors and Affiliations

Computer Science & Engineering Department, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093-5004, USA
Daniele Micciancio

Cite this paper

Beimel, A., Kasiviswanathan, S.P., Nissim, K. (2010). Bounds on the Sample Complexity for Private Learning and Private Data Release. In: Micciancio, D. (eds) Theory of Cryptography. TCC 2010. Lecture Notes in Computer Science, vol 5978. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11799-2_26

DOI: https://doi.org/10.1007/978-3-642-11799-2_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-11798-5
Online ISBN: 978-3-642-11799-2
eBook Packages: Computer Science, Computer Science (R0)
