Bounds on the Sample Complexity for Private Learning and Private Data Release
 Amos Beimel,
 Shiva Prasad Kasiviswanathan,
 Kobbi Nissim
Abstract
Learning is a task that generalizes many of the analyses that are applied to collections of data, and in particular, collections of sensitive individual information. Hence, it is natural to ask what can be learned while preserving individual privacy. [Kasiviswanathan, Lee, Nissim, Raskhodnikova, and Smith; FOCS 2008] initiated such a discussion. They formalized the notion of private learning, as a combination of PAC learning and differential privacy, and investigated what concept classes can be learned privately. Somewhat surprisingly, they showed that, ignoring time complexity, every PAC learning task could be performed privately with polynomially many samples, and in many natural cases this could even be done in polynomial time.
While these results seem to equate nonprivate and private learning, there is still a significant gap: the sample complexity of (nonprivate) PAC learning is crisply characterized in terms of the VC-dimension of the concept class, whereas this relationship is lost in the constructions of private learners, which exhibit, generally, a higher sample complexity.
Looking into this gap, we examine several private learning tasks and give tight bounds on their sample complexity. In particular, we show strong separations between sample complexities of proper and improper private learners (such separation does not exist for nonprivate learners), and between sample complexities of efficient and inefficient proper private learners. Our results show that VC-dimension is not the right measure for characterizing the sample complexity of proper private learning.
We also examine the task of private data release (as initiated by [Blum, Ligett, and Roth; STOC 2008]), and give new lower bounds on the sample complexity. Our results show that the logarithmic dependence on the size of the instance space is essential for private data release.
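For context, the nonprivate characterization the abstract refers to can be written out as a pair of standard bounds, due to Blumer et al. (1989) and Ehrenfeucht et al. (1989), both listed in the references. The notation is an assumption made here for illustration and is not fixed by the abstract: C is the concept class, d its VC-dimension, alpha the accuracy parameter, and beta the confidence parameter.

```latex
% Assumed notation: C = concept class, d = VCdim(C),
% \alpha = accuracy parameter, \beta = confidence parameter,
% m_{\mathrm{nonprivate}}(C) = sample complexity of nonprivate PAC learning of C.
\[
  \Omega\!\left(\frac{d + \log(1/\beta)}{\alpha}\right)
  \;\le\; m_{\mathrm{nonprivate}}(C)
  \;\le\; O\!\left(\frac{d\,\log(1/\alpha) + \log(1/\beta)}{\alpha}\right),
  \qquad
  d \;\le\; \log_2 |C| .
\]
```

The last inequality holds for every finite class, since shattering d points requires at least 2^d distinct concepts. It shows where a gap can open: a bound stated in terms of log |C| rather than d can be much larger than the VC-dimension bound.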
 Beimel, A., Kasiviswanathan, S., Nissim, K.: Bounds on the Sample Complexity for Private Learning and Private Data Release (Full version) (2009)
 Blum, A., Dwork, C., McSherry, F., Nissim, K. (2005) Practical privacy: The SuLQ framework. PODS. ACM, New York, pp. 128-138
 Blum, A., Ligett, K., Roth, A. (2008) A learning theory approach to non-interactive database privacy. STOC. ACM, New York, pp. 609-618
 Blum, A., Ligett, K., Roth, A.: Private communication (2008)
 Blumer, A., Ehrenfeucht, A., Haussler, D., Warmuth, M.K. (1989) Learnability and the Vapnik-Chervonenkis dimension. Journal of the Association for Computing Machinery 36: pp. 929-965
 Dwork, C. The differential privacy frontier (extended abstract). In: Reingold, O. eds. (2009) TCC 2009. Springer, Heidelberg, pp. 496-502
 Dwork, C., McSherry, F., Nissim, K., Smith, A. Calibrating noise to sensitivity in private data analysis. In: Halevi, S., Rabin, T. eds. (2006) Theory of Cryptography. Springer, Heidelberg, pp. 265-284
 Dwork, C., Naor, M., Reingold, O., Rothblum, G., Vadhan, S. (2009) On the complexity of differentially private data release. STOC. ACM, New York, pp. 381-390
 Ehrenfeucht, A., Haussler, D., Kearns, M.J., Valiant, L.G. (1989) A general lower bound on the number of examples needed for learning. Inf. Comput. 82: pp. 247-261
 Kasiviswanathan, S.P., Lee, H.K., Nissim, K., Raskhodnikova, S., Smith, A. (2008) What can we learn privately?. FOCS. IEEE Computer Society, Los Alamitos, pp. 531-540
 Kasiviswanathan, S.P., Smith, A.: A note on differential privacy: Defining resistance to arbitrary side information. CoRR, arXiv:0803.39461 [cs.CR] (2008)
 Kearns, M.J. (1998) Efficient noise-tolerant learning from statistical queries. Journal of the ACM 45: pp. 983-1006
 Kearns, M.J., Vazirani, U.V. (1994) An Introduction to Computational Learning Theory. MIT Press, Cambridge
 McSherry, F., Talwar, K. (2007) Mechanism design via differential privacy. FOCS. IEEE, Los Alamitos, pp. 94-103
 Mishra, N., Sandler, M. (2006) Privacy via pseudorandom sketches. PODS. ACM, New York, pp. 143-152
 Pitt, L., Valiant, L.G. (1988) Computational limitations on learning from examples. Journal of the ACM 35: pp. 965-984
 Valiant, L.G. (1984) A theory of the learnable. Communications of the ACM 27: pp. 1134-1142
 Vapnik, V.N., Chervonenkis, A.Y. (1971) On the uniform convergence of relative frequencies of events to their probabilities. Theory of Probability and its Applications 16: pp. 264
 Title
 Bounds on the Sample Complexity for Private Learning and Private Data Release
 Book Title
 Theory of Cryptography
 Book Subtitle
 7th Theory of Cryptography Conference, TCC 2010, Zurich, Switzerland, February 9-11, 2010. Proceedings
 Pages
 pp 437-454
 Copyright
 2010
 DOI
 10.1007/978-3-642-11799-2_26
 Print ISBN
 9783642117985
 Online ISBN
 9783642117992
 Series Title
 Lecture Notes in Computer Science
 Series Volume
 5978
 Series ISSN
 0302-9743
 Publisher
 Springer Berlin Heidelberg
 Copyright Holder
 Springer Berlin Heidelberg
 Editors

 Daniele Micciancio ^{(16)}
 Editor Affiliations

 16. Computer Science & Engineering Department, University of California,
 Authors

 Amos Beimel ^{(17)}
 Shiva Prasad Kasiviswanathan ^{(18)}
 Kobbi Nissim ^{(17)} ^{(19)}
 Author Affiliations

 17. Dept. of Computer Science, Ben-Gurion University,
 18. CCS-3, Los Alamos National Laboratory,
 19. Microsoft Audience Intelligence,