Advertisement

Anonymizing Tables

  • Gagan Aggarwal
  • Tomás Feder
  • Krishnaram Kenthapadi
  • Rajeev Motwani
  • Rina Panigrahy
  • Dilys Thomas
  • An Zhu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3363)

Abstract

We consider the problem of releasing tables from a relational database containing personal records, while ensuring individual privacy and maintaining data integrity to the extent possible. One of the techniques proposed in the literature is k-anonymization. A release is considered k-anonymous if the information for each person contained in the release cannot be distinguished from at least k–1 other persons whose information also appears in the release. In the k- Anonymity problem the objective is to minimally suppress cells in the table so as to ensure that the released version is k-anonymous. We show that the k-Anonymity problem is NP-hard even when the attribute values are ternary. On the positive side, we provide an O(k)-approximation algorithm for the problem. This improves upon the previous best-known O(klog k)-approximation. We also give improved positive results for the interesting cases with specific values of k — in particular, we give a 1.5-approximation algorithm for the special case of 2-Anonymity, and a 2-approximation algorithm for 3-Anonymity.

Keywords

Steiner Point Privacy Preserve Alphabet Size Binary Alphabet Candidate Vertex 
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. [AA01]
    Agrawal, D., Aggarwal, C.: On the design and quantification of privacy preserving datamining algorithms. In: Proc. of the ACM Symp. on Principles of Database Systems (2001)Google Scholar
  2. [AMP04]
    Aggarwal, G., Mishra, N., Pinkas, B.: Privacy preserving computation of the k-th ranked element. In: Cachin, C., Camenisch, J.L. (eds.) EUROCRYPT 2004. LNCS, vol. 3027. Springer, Heidelberg (2004)CrossRefGoogle Scholar
  3. [AS00]
    Agrawal, R., Srikant, R.: Privacy-preserving data mining. In: Proc. of the ACM SIGMOD Intl. Conf. on Management of Data, pp. 439–450 (May 2000)Google Scholar
  4. [AST03]
    Agrawal, R., Srikant, R., Thomas, D.: Privacy preserving aggregates. Technical report, Stanford University (2003)Google Scholar
  5. [Cor88]
    Cornuejols, G.P.: General factors of graphs. Journal of Combinatorial Theory B 45, 185–198 (1988)zbMATHCrossRefMathSciNetGoogle Scholar
  6. [DN03]
    Dinur, I., Nissim, K.: Revealing information while preserving privacy. In: Proc. of the ACM Symp. on Principles of Database Systems, pp. 202–210 (2003)Google Scholar
  7. [DN04]
    Dwork, C., Nissim, K.: Privacy-preserving datamining on vertically partitioned databases. In: Franklin, M. (ed.) CRYPTO 2004. LNCS, vol. 3152, pp. 528–544. Springer, Heidelberg (2004)Google Scholar
  8. [EGS03]
    Evfimievski, A., Gehrke, J., Srikant, R.: Limiting privacy breaches in privacy preserving data mining. In: Proc. of the ACM Symp. on Principles of Database Systems (June 2003)Google Scholar
  9. [Eur98]
    European Union. Directive on Privacy Protection (October 1998)Google Scholar
  10. [FNP04]
    Freedman, M., Nissim, K., Pinkas, B.: Efficient private matching and set intersection. In: Cachin, C., Camenisch, J.L. (eds.) EUROCRYPT 2004. LNCS, vol. 3027, pp. 1–19. Springer, Heidelberg (2004)CrossRefGoogle Scholar
  11. [Kan94]
    Kann, V.: Maximum bounded H-matching is MAX SNP-complete. Information Processing Letters 49, 309–318 (1994)zbMATHCrossRefMathSciNetGoogle Scholar
  12. [LP00]
    Lindell, Y., Pinkas, B.: Privacy preserving data mining. In: Bellare, M. (ed.) CRYPTO 2000. LNCS, vol. 1880, pp. 36–54. Springer, Heidelberg (2000)CrossRefGoogle Scholar
  13. [MW04]
    Meyerson, A., Williams, R.: On the complexity of optimal k-anonymity. In: Proc. of the ACM Symp. on Principles of Database Systems (June 2004)Google Scholar
  14. [SS98]
    Samarati, P., Sweeney, L.: Generalizing data to provide anonymity when disclosing information (abstract). In: Proc. of the ACM Symp. on Principles of Database Systems, pp. 188 (1998)Google Scholar
  15. [Swe00]
    Sweeney, L.: Uniqueness of simple demographics in the U.S. population. In: LIDAP-WP4. Carnegie Mellon University, Laboratory for International Data Privacy, Pittsburgh, PA (2000)Google Scholar
  16. [Swe02]
    Sweeney, L.: k-Anonymity: A model for protecting privacy. International Journal on Uncertainty Fuzziness Knowledge-based Systems (June 2002)Google Scholar
  17. [Tim97]
    Time. The Death of Privacy (August 1997)Google Scholar

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Gagan Aggarwal
    • 1
  • Tomás Feder
    • 1
  • Krishnaram Kenthapadi
    • 1
  • Rajeev Motwani
    • 1
  • Rina Panigrahy
    • 1
  • Dilys Thomas
    • 1
  • An Zhu
    • 1
  1. 1.Stanford University 

Personalised recommendations