Calibrating Noise to Sensitivity in Private Data Analysis

  • Cynthia Dwork
  • Frank McSherry
  • Kobbi Nissim
  • Adam Smith
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3876)

Abstract

We continue a line of research initiated in [10,11] on privacy-preserving statistical databases. Consider a trusted server that holds a database of sensitive information. Given a query function f mapping databases to reals, the so-called true answer is the result of applying f to the database. To protect privacy, the true answer is perturbed by the addition of random noise generated according to a carefully chosen distribution, and this response, the true answer plus noise, is returned to the user.

Previous work focused on the case of noisy sums, in which f = ∑_i g(x_i), where x_i denotes the i-th row of the database and g maps database rows to [0,1]. We extend the study to general functions f, proving that privacy can be preserved by calibrating the standard deviation of the noise according to the sensitivity of the function f. Roughly speaking, this is the amount that any single argument to f can change its output. The new analysis shows that for several particular applications substantially less noise is needed than was previously understood to be the case.
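The idea of calibrating noise to sensitivity can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the abstract does not name the noise distribution or any privacy parameter, so the Laplace distribution, the parameter `epsilon`, and the function names `laplace_noise`, `private_answer`, and `noisy_sum` are all assumptions made for this example.

```python
import random


def laplace_noise(scale, rng=random):
    """Draw one zero-mean Laplace sample with the given scale.

    The difference of two independent exponential variables with mean
    `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_answer(f, database, sensitivity, epsilon, rng=random):
    """Return f(database) plus noise whose scale grows with the sensitivity.

    `sensitivity` is the most that changing a single row can change f.
    `epsilon` is an assumed privacy parameter: smaller means more noise.
    """
    return f(database) + laplace_noise(sensitivity / epsilon, rng)


def noisy_sum(rows, g, epsilon, rng=random):
    """Noisy sum of g over the rows, where g maps each row into [0, 1]."""
    # Changing one row moves the sum by at most 1, so the sensitivity is 1.
    return private_answer(lambda db: sum(g(x) for x in db), rows, 1.0, epsilon, rng)


if __name__ == "__main__":
    rows = [0, 1, 1, 0, 1]          # hypothetical one-bit rows
    print(noisy_sum(rows, lambda x: x, epsilon=0.5))
```

The noise scale depends only on the sensitivity of f and on `epsilon`, not on the size of the database, so a low-sensitivity query needs little noise even when it aggregates many rows.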

The first step is a very clean characterization of privacy in terms of indistinguishability of transcripts. Additionally, we obtain separation results showing the increased value of interactive sanitization mechanisms over non-interactive.
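As a sketch of what indistinguishability of transcripts can mean formally, one common formulation is given below. The abstract itself does not state the definition, so the symbols are assumptions for illustration: T is the transcript of the interaction, x and x' are databases differing in a single row, and ε is a privacy parameter.

```latex
\[
  \Pr[\,T = t \mid x\,] \;\le\; e^{\varepsilon} \cdot \Pr[\,T = t \mid x'\,]
  \qquad \text{for all transcripts } t \text{ and all } x, x' \text{ differing in one row.}
\]
```

Under this reading, no single row can change the probability of any transcript by more than a factor of e^ε, which is why the amount of noise needed tracks how much one row can change f.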

References

  1. Adam, N.R., Wortmann, J.C.: Security-control methods for statistical databases: a comparative study. ACM Computing Surveys 25(4) (December 1989)
  2. Agrawal, D., Aggarwal, C.C.: On the design and quantification of privacy preserving data mining algorithms. In: Proceedings of the Twentieth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems. ACM, New York (2001)
  3. Agrawal, R., Srikant, R.: Privacy-preserving data mining. In: Chen, W., Naughton, J.F., Bernstein, P.A. (eds.) SIGMOD Conference, pp. 439–450. ACM, New York (2000)
  4. Ben-Sasson, E., Harsha, P., Raskhodnikova, S.: Some 3CNF properties are hard to test. In: STOC, pp. 345–354. ACM, New York (2000)
  5. Web page for the Bertinoro CS-Statistics workshop on privacy and confidentiality (July 2005). Available from http://www.stat.cmu.edu/~hwainer
  6. Blum, A., Dwork, C., McSherry, F., Nissim, K.: Practical privacy: The SuLQ framework. In: PODS (2005)
  7. Chawla, S., Dwork, C., McSherry, F., Smith, A., Wee, H.: Toward privacy in public databases. In: Theory of Cryptography Conference (TCC), pp. 363–385 (2005)
  8. Chawla, S., Dwork, C., McSherry, F., Talwar, K.: On the utility of privacy-preserving histograms. In: 21st Conference on Uncertainty in Artificial Intelligence (UAI) (2005)
  9. Denning, D.E.: Secure statistical databases with random sample queries. ACM Transactions on Database Systems 5(3), 291–315 (September 1980)
  10. Dinur, I., Nissim, K.: Revealing information while preserving privacy. In: Proceedings of the Twenty-Second ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, pp. 202–210 (2003)
  11. Dwork, C., Nissim, K.: Privacy-preserving datamining on vertically partitioned databases. In: Franklin, M. (ed.) CRYPTO 2004. LNCS, vol. 3152, pp. 528–544. Springer, Heidelberg (2004)
  12. Evfimievski, A.V., Gehrke, J., Srikant, R.: Limiting privacy breaches in privacy preserving data mining. In: Proceedings of the Twenty-Second ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, pp. 211–222 (2003)
  13. Goldwasser, S., Micali, S.: Probabilistic encryption. Journal of Computer and System Sciences 28(2), 270–299 (1984)
  14. Roque, G.: Masking microdata with mixtures of normal distributions. Doctoral Dissertation, University of California, Riverside (2000)
  15. Sweeney, L.: k-anonymity: A model for protecting privacy. International Journal on Uncertainty, Fuzziness and Knowledge-based Systems 10(5), 557–570 (2002)

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Cynthia Dwork (1)
  • Frank McSherry (1)
  • Kobbi Nissim (2)
  • Adam Smith (3)

  1. Microsoft Research
  2. Ben-Gurion University, Israel
  3. Weizmann Institute of Science, Israel
