k-Anonymous Data Mining: A Survey

  • V. Ciriani
  • S. De Capitani di Vimercati
  • S. Foresti
  • P. Samarati
Part of the Advances in Database Systems book series (ADBS, volume 34)

Data mining technology has attracted significant interest as a means of identifying patterns and trends in large collections of data. However, collecting and analyzing data that include personal information may violate the privacy of the individuals to whom the information refers. Privacy protection in data mining is therefore becoming a crucial issue that has captured the attention of many researchers.

In this chapter, we first describe the concept of k-anonymity and illustrate different approaches for its enforcement. We then discuss how the privacy requirements characterized by k-anonymity can be violated in data mining and introduce possible approaches to ensure the satisfaction of k-anonymity in data mining.
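
As a rough illustration of the requirement mentioned above, the sketch below assumes the usual formulation of k-anonymity: a released table is k-anonymous with respect to a set of quasi-identifier attributes if every combination of their values occurs in at least k rows. The function name `is_k_anonymous`, the toy table, and its attribute names are hypothetical and are not taken from the chapter.

```python
from collections import Counter


def is_k_anonymous(rows, quasi_identifiers, k):
    """Return True if every combination of quasi-identifier values
    that occurs in `rows` occurs in at least `k` rows."""
    # Count how many rows share each quasi-identifier value combination.
    groups = Counter(
        tuple(row[attr] for attr in quasi_identifiers) for row in rows
    )
    return all(count >= k for count in groups.values())


# Hypothetical, already generalized toy table (illustrative values only).
rows = [
    {"zip": "941**", "age": "30-39", "disease": "flu"},
    {"zip": "941**", "age": "30-39", "disease": "asthma"},
    {"zip": "940**", "age": "40-49", "disease": "flu"},
    {"zip": "940**", "age": "40-49", "disease": "diabetes"},
]

print(is_k_anonymous(rows, ["zip", "age"], 2))
print(is_k_anonymous(rows, ["zip", "age"], 3))
```

Each quasi-identifier combination in the toy table is shared by exactly two rows, so the check succeeds for k = 2 and fails for k = 3. A check of this kind only verifies the property; it says nothing about how to generalize or suppress values to achieve it.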

Keywords

k-anonymity, data mining, privacy

Copyright information

© Springer Science+Business Media, LLC 2008

Authors and Affiliations

  • V. Ciriani (1)
  • S. De Capitani di Vimercati (1)
  • S. Foresti (1)
  • P. Samarati (1)

  1. DTI - Università degli Studi di Milano, Italy