Privacy and the Dimensionality Curse

  • Charu C. Aggarwal
Part of the Advances in Database Systems book series (ADBS, volume 34)

Most privacy-preservation methods, such as k-anonymity or randomization, apply some kind of transformation to the data. In many cases, however, a record can still be identified indirectly through a combination of attributes. Such attributes may be available from public records and can be used to link sensitive records to the target of interest, so the sensitive attributes of a record may be inferred from the publicly available ones. Often the target of interest is known to the adversary, who therefore knows a large number of attribute combinations. This is a reasonable assumption, since privacy attacks will often be mounted by an adversary with some knowledge of the target. As the number of attributes available for identification increases, the target can be identified almost uniquely. In this paper, we examine a number of privacy-preservation methods and show that in each case the approach becomes either ineffective or infeasible when the dimensionality is high.
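
The effect described above can be illustrated with a small simulation. The following Python sketch is not taken from the chapter: the synthetic data, the uniform attribute distribution, and the helper names `make_records` and `fraction_unique` are all assumptions chosen for illustration. It measures how many records become uniquely identifiable as an adversary learns more attributes.

```python
import random
from collections import Counter


def make_records(n, d, values_per_attr, seed=0):
    """Generate n synthetic records with d attributes each.

    Assumption: every attribute is drawn uniformly and independently,
    which real data rarely satisfies.
    """
    rng = random.Random(seed)
    return [[rng.randrange(values_per_attr) for _ in range(d)] for _ in range(n)]


def fraction_unique(records, num_attrs):
    """Fraction of records whose first num_attrs values match no other record."""
    keys = [tuple(r[:num_attrs]) for r in records]
    counts = Counter(keys)
    return sum(1 for k in keys if counts[k] == 1) / len(records)


# 1000 hypothetical records, 12 attributes, 4 possible values per attribute.
records = make_records(n=1000, d=12, values_per_attr=4)
fractions = [fraction_unique(records, j) for j in range(1, 13)]

for j, f in enumerate(fractions, start=1):
    print(f"{j:2d} known attributes -> {f:.3f} of records uniquely identifiable")
```

Each additional attribute refines the grouping of records, so the unique fraction can only stay the same or grow. Under these assumptions it climbs from zero toward one well before all twelve attributes are known, which is the behaviour that makes grouping-based methods such as k-anonymity hard to apply without heavy information loss.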

Keywords

High-dimensional privacy, dimensionality curse for privacy

Copyright information

© Springer Science+Business Media, LLC 2008

Authors and Affiliations

  • Charu C. Aggarwal
    • Department of Computer Information Systems, IBM Thomas J. Watson Research Center, Hawthorne, USA
