Learning Privately with Labeled and Unlabeled Examples

Algorithmica

Abstract

A private learner is an algorithm that, given a sample of labeled individual examples, outputs a generalizing hypothesis while preserving the privacy of each individual. Kasiviswanathan et al. (FOCS 2008) gave a generic construction of private learners in which the sample complexity is (generally) higher than what is needed for non-private learners. This gap in the sample complexity was further studied in several follow-up papers, which showed that (at least in some cases) the gap is unavoidable. Those papers also considered ways to overcome the gap by relaxing either the privacy or the learning guarantees of the learner. We suggest an alternative approach, inspired by the (non-private) models of semi-supervised learning and active learning, where the focus is on the sample complexity of labeled examples, whereas unlabeled examples are of significantly lower cost. We consider private semi-supervised learners that operate on a random sample, where only a (hopefully small) portion of this sample is labeled. The learners have no control over which of the sample elements are labeled. Our main result is that the labeled sample complexity of private learners is characterized by the VC dimension. We present two generic constructions of private semi-supervised learners. In the first construction, the labeled sample complexity is proportional to the VC dimension of the concept class; however, the unlabeled sample complexity of the algorithm is as big as the representation length of domain elements. Our second construction presents a new technique for decreasing the labeled sample complexity of a given private learner while roughly maintaining its unlabeled sample complexity. In addition, we show that in some settings the labeled sample complexity does not depend on the privacy parameters of the learner.
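For intuition, the generic construction of Kasiviswanathan et al. can be viewed as an instantiation of the exponential mechanism of McSherry and Talwar over a finite hypothesis class, scoring each hypothesis by minus its empirical error. The following Python sketch is illustrative only; the function name, parameters, and the representation of hypotheses as Python callables are our assumptions, not the paper's notation.

```python
import math
import random

def exponential_mechanism_learner(sample, hypotheses, epsilon):
    """Illustrative sketch (not the paper's exact construction): pick a
    hypothesis from a finite class via the exponential mechanism, with
    score(h) = -(number of empirical errors of h on the sample).
    The score has sensitivity 1, since changing one labeled example
    changes any hypothesis's error count by at most 1."""
    scores = [-sum(1 for (x, y) in sample if h(x) != y) for h in hypotheses]
    # Sample h with probability proportional to exp(epsilon * score / 2).
    max_s = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(epsilon * (s - max_s) / 2) for s in scores]
    r = random.uniform(0, sum(weights))
    acc = 0.0
    for h, w in zip(hypotheses, weights):
        acc += w
        if r <= acc:
            return h
    return hypotheses[-1]
```

For a fixed privacy parameter, the mechanism's utility guarantee degrades with the logarithm of the class size, which is one way to see why this generic construction needs more samples than the non-private VC bound.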


Notes

  1. To simplify the exposition, we omit in this section the dependency on all variables except for d, the representation length of domain elements.

  2. A semi-supervised learner uses a small batch of labeled examples and a large batch of unlabeled examples, whereas an active learner operates on a large batch of unlabeled examples and chooses (possibly adaptively) a subset of the examples for which it gets the labels.

  3. We remark that—unlike this work—the focus in [5] is on the dependency of the labeled sample complexity on the approximation parameter. As our learners are non-active, their labeled sample complexity is lower bounded by \(\varOmega (\frac{1}{\alpha })\) (where \(\alpha\) is the approximation parameter).

  4. Combining the technique of [10] and the recent result of [31], the unlabeled sample complexity can be reduced to \({\widetilde{O}}(\ell ^3 \cdot (\log ^*d)^{1.5})\).

  5. Feldman and Xiao [27] showed an example of a concept class C over \(X_d\) for which every pure-private learner must have unlabeled sample complexity \(\varOmega ({\mathrm{VC}}(C)\cdot d)\). Hence, as a function of d and \({\mathrm{VC}}(C)\), the unlabeled sample complexity in Theorem 3.3 is the best possible for a generic construction of pure-private learners.

  6. These works present sanitizers for \(\mathtt{THRESH}_d\), but any sanitizer for \(\mathtt{THRESH}_d\) can easily be transformed into a sanitizer for \(\mathtt{INTERVAL}_d\).
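The semi-supervised input model described in note 2, where a random sample arrives with only some elements labeled and the learner has no control over which, can be formalized in several ways. The sketch below is a minimal illustration under our own simplifying assumption that a uniformly random subset of m out of n points arrives labeled; the function and parameter names are hypothetical, not the paper's definitions.

```python
import random

def draw_semi_supervised_sample(draw_point, target, n, m):
    """Draw n i.i.d. points; a uniformly random m of them arrive with
    their labels, the rest arrive unlabeled.  The learner never chooses
    which points are labeled (in contrast to an active learner)."""
    points = [draw_point() for _ in range(n)]
    labeled_idx = set(random.sample(range(n), m))
    labeled = [(x, target(x)) for i, x in enumerate(points) if i in labeled_idx]
    unlabeled = [x for i, x in enumerate(points) if i not in labeled_idx]
    return labeled, unlabeled
```

In this model the quantity of interest is how small m (the labeled sample complexity) can be while n (the total, mostly unlabeled, sample) is allowed to be larger.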

References

  1. Abadi, M., Chu, A., Goodfellow, I.J., McMahan, H.B., Mironov, I., Talwar, K., Zhang, L.: Deep learning with differential privacy. In: Weippl, E.R., Katzenbeisser, S., Kruegel, C., Myers, A.C., Halevi, S. (eds.) Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, Vienna, Austria, October 24–28, 2016, pp. 308–318. ACM (2016). https://doi.org/10.1145/2976749.2978318

  2. Agrawala, A.: Learning with a probabilistic teacher. IEEE Trans. Inf. Theory 16(4), 373–379 (1970). https://doi.org/10.1109/TIT.1970.1054472

  3. Alon, N., Beimel, A., Moran, S., Stemmer, U.: Closure properties for private classification and online prediction. In: COLT (2020)

  4. Alon, N., Livni, R., Malliaris, M., Moran, S.: Private PAC learning implies finite Littlestone dimension. In: Charikar, M., Cohen, E. (eds.) Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing, STOC 2019, Phoenix, AZ, USA, June 23–26, 2019, pp. 852–860. ACM (2019). https://doi.org/10.1145/3313276.3316312

  5. Balcan, M.F., Feldman, V.: Statistical active learning algorithms. In: Advances in Neural Information Processing Systems, vol. 26, pp. 1295–1303 (2013)

  6. Bassily, R., Feldman, V., Talwar, K., Thakurta, A.G.: Private stochastic convex optimization with optimal rates. In: Wallach, H.M., Larochelle, H., Beygelzimer, A., d’Alché-Buc, F., Fox, E.B., Garnett, R. (eds.) Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, 8–14 December 2019, Vancouver, BC, Canada, pp. 11279–11288 (2019). http://papers.nips.cc/paper/9306-private-stochastic-convex-optimization-with-optimal-rates

  7. Bassily, R., Nissim, K., Smith, A.D., Steinke, T., Stemmer, U., Ullman, J.: Algorithmic stability for adaptive data analysis. In: Proceedings of the 48th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2016, Cambridge, MA, USA, June 18–21, 2016, pp. 1046–1059 (2016). https://doi.org/10.1145/2897518.2897566

  8. Beimel, A., Brenner, H., Kasiviswanathan, S.P., Nissim, K.: Bounds on the sample complexity for private learning and private data release. Mach. Learn. 94(3), 401–437 (2014)

  9. Beimel, A., Nissim, K., Stemmer, U.: Characterizing the sample complexity of private learners. In: ITCS, pp. 97–110. ACM (2013)

  10. Beimel, A., Nissim, K., Stemmer, U.: Private learning and sanitization: pure vs. approximate differential privacy. Theory Comput. 12(1), 1–61 (2016). https://doi.org/10.4086/toc.2016.v012a001

  11. Blum, A., Dwork, C., McSherry, F., Nissim, K.: Practical privacy: the SuLQ framework. In: Li, C. (ed.) PODS, pp. 128–138. ACM (2005)

  12. Blum, A., Ligett, K., Roth, A.: A learning theory approach to noninteractive database privacy. J. ACM 60(2), 12 (2013)

  13. Blumer, A., Ehrenfeucht, A., Haussler, D., Warmuth, M.K.: Learnability and the Vapnik-Chervonenkis dimension. J. ACM 36(4), 929–965 (1989)

  14. Bun, M., Livni, R., Moran, S.: An equivalence between private classification and online prediction. CoRR arXiv:2003.00563 (2020)

  15. Bun, M., Nissim, K., Stemmer, U.: Simultaneous private learning of multiple concepts. In: ITCS, pp. 369–380. ACM (2016)

  16. Bun, M., Nissim, K., Stemmer, U., Vadhan, S.P.: Differentially private release and learning of threshold functions. In: FOCS, pp. 634–649 (2015)

  17. Bun, M., Ullman, J., Vadhan, S.P.: Fingerprinting codes and the price of approximate differential privacy. In: Symposium on Theory of Computing, STOC 2014, New York, NY, USA, May 31–June 03, 2014, pp. 1–10 (2014). https://doi.org/10.1145/2591796.2591877

  18. Chaudhuri, K., Hsu, D.: Sample complexity bounds for differentially private learning. In: Kakade, S.M., von Luxburg, U. (eds.) COLT, JMLR Proceedings, vol. 19, pp. 155–186. JMLR.org (2011)

  19. Chaudhuri, K., Monteleoni, C.: Privacy-preserving logistic regression. In: Koller, D., Schuurmans, D., Bengio, Y., Bottou, L. (eds.) NIPS. MIT Press (2008)

  20. Chaudhuri, K., Monteleoni, C., Sarwate, A.D.: Differentially private empirical risk minimization. J. Mach. Learn. Res. 12, 1069–1109 (2011). http://dl.acm.org/citation.cfm?id=1953048.2021036

  21. Dwork, C., Feldman, V., Hardt, M., Pitassi, T., Reingold, O., Roth, A.L.: Preserving statistical validity in adaptive data analysis. In: STOC, pp. 117–126. ACM (2015)

  22. Dwork, C., Kenthapadi, K., McSherry, F., Mironov, I., Naor, M.: Our data, ourselves: privacy via distributed noise generation. In: Vaudenay, S. (ed.) EUROCRYPT, Lecture Notes in Computer Science, vol. 4004, pp. 486–503. Springer (2006)

  23. Dwork, C., McSherry, F., Nissim, K., Smith, A.: Calibrating noise to sensitivity in private data analysis. In: TCC, Lecture Notes in Computer Science, vol. 3876, pp. 265–284. Springer (2006)

  24. Dwork, C., Rothblum, G.N., Vadhan, S.P.: Boosting and differential privacy. In: FOCS, pp. 51–60. IEEE Computer Society (2010)

  25. Ehrenfeucht, A., Haussler, D., Kearns, M.J., Valiant, L.G.: A general lower bound on the number of examples needed for learning. Inf. Comput. 82(3), 247–261 (1989)

  26. Feldman, V., Koren, T., Talwar, K.: Private stochastic convex optimization: optimal rates in linear time. In: Makarychev, K., Makarychev, Y., Tulsiani, M., Kamath, G., Chuzhoy, J. (eds.) Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing, STOC 2020, Chicago, IL, USA, June 22–26, 2020, pp. 439–449. ACM (2020). https://doi.org/10.1145/3357713.3384335

  27. Feldman, V., Xiao, D.: Sample complexity bounds on differentially private learning via communication complexity. SIAM J. Comput. 44(6), 1740–1764 (2015). https://doi.org/10.1137/140991844

  28. Fralick, S.: Learning to recognize patterns without a teacher. IEEE Trans. Inf. Theory 13(1), 57–64 (1967). https://doi.org/10.1109/TIT.1967.1053952

  29. Gupta, A., Hardt, M., Roth, A., Ullman, J.: Privately releasing conjunctions and the statistical query barrier. In: Fortnow, L., Vadhan, S.P. (eds.) STOC, pp. 803–812. ACM (2011)

  30. Hardt, M., Ullman, J.: Preventing false discovery in interactive data analysis is hard. In: FOCS. IEEE (2014)

  31. Kaplan, H., Ligett, K., Mansour, Y., Naor, M., Stemmer, U.: Privately learning thresholds: closing the exponential gap. In: COLT (2020)

  32. Kasiviswanathan, S.P., Lee, H.K., Nissim, K., Raskhodnikova, S., Smith, A.: What can we learn privately? SIAM J. Comput. 40(3), 793–826 (2011)

  33. Kearns, M.J.: Efficient noise-tolerant learning from statistical queries. J. ACM 45(6), 983–1006 (1998)

  34. McCallum, A., Nigam, K.: Employing EM and pool-based active learning for text classification. In: Proceedings of the Fifteenth International Conference on Machine Learning, ICML’98, pp. 350–358. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA (1998). http://dl.acm.org/citation.cfm?id=645527.757765

  35. McSherry, F., Talwar, K.: Mechanism design via differential privacy. In: FOCS, pp. 94–103. IEEE Computer Society (2007)

  36. Papernot, N., Abadi, M., Erlingsson, Ú., Goodfellow, I.J., Talwar, K.: Semi-supervised knowledge transfer for deep learning from private training data. In: 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, April 24–26, 2017, Conference Track Proceedings. OpenReview.net (2017). https://openreview.net/forum?id=HkwoSDPgg

  37. Rubinstein, B.I.P., Bartlett, P.L., Huang, L., Taft, N.: Learning in a large function space: privacy-preserving mechanisms for SVM learning. CoRR arXiv:0911.5708 (2009)

  38. Sauer, N.: On the density of families of sets. J. Comb. Theory Ser. A 13(1), 145–147 (1972). https://doi.org/10.1016/0097-3165(72)90019-2

  39. Scudder, H.: Probability of error of some adaptive pattern-recognition machines. IEEE Trans. Inf. Theory 11(3), 363–371 (1965). https://doi.org/10.1109/TIT.1965.1053799

  40. Valiant, L.G.: A theory of the learnable. Commun. ACM 27(11), 1134–1142 (1984). https://doi.org/10.1145/1968.1972

  41. Vapnik, V., Chervonenkis, A.: Theory of Pattern Recognition [in Russian]. Nauka, Moscow (1974)

  42. Vapnik, V.N., Chervonenkis, A.Y.: On the uniform convergence of relative frequencies of events to their probabilities. Theory Probab. Appl. 16(2), 264–280 (1971)

Acknowledgements

We thank Aryeh Kontorovich, Adam Smith, and Salil Vadhan for helpful discussions of ideas in this work. We thank the anonymous reviewers for their helpful comments and suggestions. Work of A. B. was supported in part by the Israeli Ministry of Science and Technology, by the Israel Science Foundation (Grants 544/13 and 152/17), by the Frankel Center for Computer Science, by ERC Grant 742754 (project NTSC), by the Cyber Security Research Center at Ben-Gurion University of the Negev, and by NSF Grant No. 1565387, TWC: Large: Collaborative: Computing Over Distributed Sensitive Data. Work of K. N. was done in part while the author was visiting in the Center for Research on Computation and Society, Harvard University, and was initially supported by the Israel Science Foundation (Grant 276/12) and by NSF Grant CNS-1237235 and later by NSF Grant No. 1565387 TWC: Large: Collaborative: Computing Over Distributed Sensitive Data. Work of U. S. was supported in part by the Israeli Ministry of Science and Technology, by the Check Point Institute for Information Security, by the IBM PhD Fellowship Awards Program, by the Frankel Center for Computer Science, by the Israel Science Foundation (Grant 1871/19), and by the Cyber Security Research Center at Ben-Gurion University of the Negev.

Corresponding author

Correspondence to Uri Stemmer.

Additional information

A preliminary version of this paper appeared in SODA’15.

Cite this article

Beimel, A., Nissim, K. & Stemmer, U. Learning Privately with Labeled and Unlabeled Examples. Algorithmica 83, 177–215 (2021). https://doi.org/10.1007/s00453-020-00753-z
