Encyclopedia of Database Systems

2018 Edition
Editors: Ling Liu, M. Tamer Özsu

Randomization Methods to Ensure Data Privacy

  • Ashwin Machanavajjhala
  • Johannes Gehrke
Reference work entry
DOI: https://doi.org/10.1007/978-1-4614-8265-9_301

Synonyms

Perturbation techniques

Definition

Many organizations, e.g., government statistical offices and search engine companies, collect potentially sensitive information about individuals, either to publish this data for research or in return for useful services. While some data collection organizations, like the census, are legally required not to breach the privacy of individuals, other data collection organizations may not be trusted to uphold privacy. Hence, if U denotes the original data containing sensitive information about a set of individuals, then an untrusted data collector or researcher should only have access to an anonymized version of the data, U*, that does not disclose the sensitive information about the individuals. A randomized anonymization algorithm R is said to be a privacy-preserving randomization method if for every table T, and for every output T* = R(T), the privacy of all the sensitive information of each individual in the original data is...
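As a concrete illustration of a randomized algorithm R that maps a table T to a perturbed output T*, the following minimal sketch implements a randomized-response scheme in the spirit of Warner's survey technique (entry 17 in the Recommended Reading). It is not taken from this entry: the function names, the single-bit-per-individual table, and the retention probability p = 0.75 are all assumptions chosen for illustration, and the sketch makes no claim about which formal privacy definition the scheme satisfies.

```python
import random


def randomized_response(table, p=0.75, rng=None):
    """Hypothetical R: return T* where each sensitive bit of T is kept
    with probability p and flipped with probability 1 - p."""
    if rng is None:
        rng = random.Random()
    return [bit if rng.random() < p else 1 - bit for bit in table]


def estimate_true_fraction(perturbed, p=0.75):
    """Estimate the fraction of 1s in the original table from T* alone.

    If pi is the true fraction, the expected observed fraction is
    p * pi + (1 - p) * (1 - pi), so pi = (observed - (1 - p)) / (2p - 1).
    Requires p != 0.5.
    """
    observed = sum(perturbed) / len(perturbed)
    return (observed - (1 - p)) / (2 * p - 1)


if __name__ == "__main__":
    # Illustrative data only: 30% of individuals hold the sensitive value 1.
    original = [1] * 3000 + [0] * 7000
    released = randomized_response(original, p=0.75, rng=random.Random(42))
    print("estimated fraction:", round(estimate_true_fraction(released), 3))
```

Under these assumptions, any single released bit is uncertain evidence about the individual's true value, while the aggregate fraction can still be estimated from T*; choosing p closer to 0.5 adds more noise per individual at the cost of a less accurate estimate.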


Recommended Reading

  1. Adam NR, Wortmann JC. Security-control methods for statistical databases: a comparative study. ACM Comput Surv. 1989;21(4):515–56.
  2. Agrawal R, Srikant R. Privacy preserving data mining. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 2000. p. 439–50.
  3. Agrawal S, Haritsa JR. A framework for high-accuracy privacy-preserving mining. In: Proceedings of the 21st International Conference on Data Engineering; 2005. p. 193–204.
  4. Barak B, Chaudhuri K, Dwork C, Kale S, McSherry F, Talwar K. Privacy, accuracy and consistency too: a holistic solution to contingency table release. In: Proceedings of the 26th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems; 2007.
  5. Blum A, Dwork C, McSherry F, Nissim K. Practical privacy: the SuLQ framework. In: Proceedings of the 24th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems; 2005. p. 128–38.
  6. Dwork C, McSherry F, Nissim K, Smith A. Calibrating noise to sensitivity in private data analysis. In: Proceedings of the 3rd Theory of Cryptography Conference; 2006. p. 265–84.
  7. Evfimievski A, Gehrke J, Srikant R. Limiting privacy breaches in privacy preserving data mining. In: Proceedings of the 22nd ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems; 2003. p. 211–22.
  8. Evfimievski A, Srikant R, Gehrke J, Agrawal R. Privacy preserving data mining of association rules. In: Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; 2002. p. 217–28.
  9. Huang Z, Du W, Chen B. Deriving private information from randomized data. In: Proceedings of the 23rd ACM SIGMOD Conference on Management of Data; 2004.
  10. Kargupta H, Datta S, Wang Q, Sivakumar K. On the privacy preserving properties of random data perturbation techniques. In: Proceedings of the 2003 IEEE International Conference on Data Mining; 2003. p. 99–106.
  11. Kifer D, Gehrke J. Injecting utility into anonymized datasets. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 2006.
  12. Machanavajjhala A, Kifer D, Abowd J, Gehrke J, Vilhuber L. Privacy: from theory to practice on the map. In: Proceedings of the 24th International Conference on Data Engineering; 2008.
  13. On The Map (Version 2) http://lehdmap2.dsd.census.gov/.
  14. Rastogi V, Suciu D, Hong S. The boundary between privacy and utility in data publishing. Tech. rep., University of Washington; 2007.
  15. Reiter J. Estimating risks of identification disclosure for microdata. J Am Stat Assoc. 2005;100(472):1103–13.
  16. Rubin DB. Discussion: statistical disclosure limitation. J Off Stat. 1993;9(2):461–8.
  17. Warner SL. Randomized response: a survey technique for eliminating evasive answer bias. J Am Stat Assoc. 1965;60(309):63–9.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Cornell University, Ithaca, USA

Section editors and affiliations

  • Chris Clifton, Dept. of Computer Science, Purdue University, West Lafayette, USA