A Survey of Association Rule Hiding Methods for Privacy

  • Vassilios S. Verykios
  • Aris Gkoulalas-Divanis
Part of the Advances in Database Systems book series (ADBS, volume 34)

Data hiding and knowledge hiding are two research directions that investigate how the privacy of raw data, or information, can be maintained either before or after the data is mined. Focusing on the knowledge hiding thread, we present a taxonomy and a survey of recent approaches that have been applied to the association rule hiding problem. Association rule hiding refers to the process of modifying the original database in such a way that certain sensitive association rules disappear without seriously affecting the data and the non-sensitive rules. We also provide a thorough comparison of the presented approaches, and we touch upon hiding approaches used for other data mining tasks. A detailed presentation of the metrics used to evaluate the performance of those approaches is also given. Finally, we conclude our study by enumerating interesting future directions for this body of research.
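
To make the definition of association rule hiding concrete, the following is a minimal sketch and not one of the surveyed algorithms. It assumes the usual setting in which a rule is mined only if the support of its itemset reaches a minimum threshold, and it hides a sensitive rule by naively deleting one consequent item from supporting transactions until the support drops below that threshold. The function names, the toy database, and the threshold value are all hypothetical and chosen only for illustration.

```python
from typing import List, Set

Transaction = Set[str]


def support(db: List[Transaction], itemset: Set[str]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    if not db:
        return 0.0
    return sum(1 for t in db if itemset <= t) / len(db)


def confidence(db: List[Transaction], antecedent: Set[str], consequent: Set[str]) -> float:
    """support(antecedent and consequent together) / support(antecedent)."""
    s_a = support(db, antecedent)
    return support(db, antecedent | consequent) / s_a if s_a else 0.0


def hide_rule(db: List[Transaction], antecedent: Set[str], consequent: Set[str],
              min_support: float) -> List[Transaction]:
    """Return a modified copy of `db` in which the rule's itemset is infrequent.

    Naive strategy (an assumption of this sketch): walk the transactions in
    order and delete one consequent item from each supporting transaction
    until the support falls below `min_support`.
    """
    sanitized = [set(t) for t in db]   # leave the original database untouched
    rule_items = antecedent | consequent
    victim = sorted(consequent)[0]     # deterministic choice of item to delete
    for t in sanitized:
        if support(sanitized, rule_items) < min_support:
            break
        if rule_items <= t:
            t.discard(victim)
    return sanitized


# Toy database; the sensitive rule is {bread} -> {milk}.
db = [
    {"bread", "milk"},
    {"bread", "milk", "eggs"},
    {"bread", "milk"},
    {"milk", "eggs"},
    {"bread", "eggs"},
]
sanitized = hide_rule(db, {"bread"}, {"milk"}, min_support=0.5)
```

In this toy run the support of the sensitive itemset {bread, milk} drops below the assumed threshold after a single deletion, while the supports of the non-sensitive itemsets {milk, eggs} and {bread, eggs} are unchanged. Real hiding methods choose which transactions and items to modify far more carefully, precisely to limit such side effects on the data and on the non-sensitive rules.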

Keywords

Privacy preserving data mining, knowledge hiding, frequent itemset hiding, association rule hiding

Copyright information

© Springer Science+Business Media, LLC 2008

Authors and Affiliations

  • Vassilios S. Verykios (1)
  • Aris Gkoulalas-Divanis (1)
  1. Dept. of Computer and Communication Engineering, University of Thessaly, Volos, Greece