Anonymizing Tables

  • Gagan Aggarwal
  • Tomás Feder
  • Krishnaram Kenthapadi
  • Rajeev Motwani
  • Rina Panigrahy
  • Dilys Thomas
  • An Zhu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3363)

Abstract

We consider the problem of releasing tables from a relational database containing personal records, while ensuring individual privacy and maintaining data integrity to the extent possible. One of the techniques proposed in the literature is k-anonymization. A release is considered k-anonymous if the information for each person contained in the release cannot be distinguished from at least k–1 other persons whose information also appears in the release. In the k-Anonymityproblem the objective is to minimally suppress cells in the table so as to ensure that the released version is k-anonymous. We show that the k-Anonymity problem is NP-hard even when the attribute values are ternary. On the positive side, we provide an O(k)-approximation algorithm for the problem. This improves upon the previous best-known O(klog k)-approximation. We also give improved positive results for the interesting cases with specific values of k — in particular, we give a 1.5-approximation algorithm for the special case of 2-Anonymity, and a 2-approximation algorithm for 3-Anonymity.

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. [AA01]
    Agrawal, D., Aggarwal, C.: On the design and quantification of privacy preserving datamining algorithms. In: Proc. of the ACM Symp. on Principles of Database Systems (2001)Google Scholar
  2. [AMP04]
    Aggarwal, G., Mishra, N., Pinkas, B.: Privacy preserving computation of the k-th ranked element. In: Cachin, C., Camenisch, J.L. (eds.) EUROCRYPT 2004. LNCS, vol. 3027. Springer, Heidelberg (2004)CrossRefGoogle Scholar
  3. [AS00]
    Agrawal, R., Srikant, R.: Privacy-preserving data mining. In: Proc. of the ACM SIGMOD Intl. Conf. on Management of Data, pp. 439–450 (May 2000)Google Scholar
  4. [AST03]
    Agrawal, R., Srikant, R., Thomas, D.: Privacy preserving aggregates. Technical report, Stanford University (2003)Google Scholar
  5. [Cor88]
    Cornuejols, G.P.: General factors of graphs. Journal of Combinatorial Theory B 45, 185–198 (1988)MATHCrossRefMathSciNetGoogle Scholar
  6. [DN03]
    Dinur, I., Nissim, K.: Revealing information while preserving privacy. In: Proc. of the ACM Symp. on Principles of Database Systems, pp. 202–210 (2003)Google Scholar
  7. [DN04]
    Dwork, C., Nissim, K.: Privacy-preserving datamining on vertically partitioned databases. In: Franklin, M. (ed.) CRYPTO 2004. LNCS, vol. 3152, pp. 528–544. Springer, Heidelberg (2004)Google Scholar
  8. [EGS03]
    Evfimievski, A., Gehrke, J., Srikant, R.: Limiting privacy breaches in privacy preserving data mining. In: Proc. of the ACM Symp. on Principles of Database Systems (June 2003)Google Scholar
  9. [Eur98]
    European Union. Directive on Privacy Protection (October 1998)Google Scholar
  10. [FNP04]
    Freedman, M., Nissim, K., Pinkas, B.: Efficient private matching and set intersection. In: Cachin, C., Camenisch, J.L. (eds.) EUROCRYPT 2004. LNCS, vol. 3027, pp. 1–19. Springer, Heidelberg (2004)CrossRefGoogle Scholar
  11. [Kan94]
    Kann, V.: Maximum bounded H-matching is MAX SNP-complete. Information Processing Letters 49, 309–318 (1994)MATHCrossRefMathSciNetGoogle Scholar
  12. [LP00]
    Lindell, Y., Pinkas, B.: Privacy preserving data mining. In: Bellare, M. (ed.) CRYPTO 2000. LNCS, vol. 1880, pp. 36–54. Springer, Heidelberg (2000)CrossRefGoogle Scholar
  13. [MW04]
    Meyerson, A., Williams, R.: On the complexity of optimal k-anonymity. In: Proc. of the ACM Symp. on Principles of Database Systems (June 2004)Google Scholar
  14. [SS98]
    Samarati, P., Sweeney, L.: Generalizing data to provide anonymity when disclosing information (abstract). In: Proc. of the ACM Symp. on Principles of Database Systems, pp. 188 (1998)Google Scholar
  15. [Swe00]
    Sweeney, L.: Uniqueness of simple demographics in the U.S. population. In: LIDAP-WP4. Carnegie Mellon University, Laboratory for International Data Privacy, Pittsburgh, PA (2000)Google Scholar
  16. [Swe02]
    Sweeney, L.: k-Anonymity: A model for protecting privacy. International Journal on Uncertainty Fuzziness Knowledge-based Systems (June 2002)Google Scholar
  17. [Tim97]
    Time. The Death of Privacy (August 1997)Google Scholar

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Gagan Aggarwal
    • 1
  • Tomás Feder
    • 1
  • Krishnaram Kenthapadi
    • 1
  • Rajeev Motwani
    • 1
  • Rina Panigrahy
    • 1
  • Dilys Thomas
    • 1
  • An Zhu
    • 1
  1. 1.Stanford University 

Personalised recommendations