Privacy preservation through a greedy, distortion-based rule-hiding method

Applied Intelligence

Abstract

Data mining techniques can discover useful knowledge in large collections of data. However, sharing data between organizations risks disclosing sensitive information, so the balance between legitimate mining needs and the protection of confidential knowledge must be managed carefully whenever data is released or shared. In this paper, we study privacy preservation in association rule mining. We propose a new distortion-based method that hides sensitive rules by removing items from a database so that the support or confidence of each sensitive rule falls below a specified threshold. To minimize side effects on non-sensitive knowledge, the supporting transactions are sorted by the number of non-sensitive itemsets each one contains, and candidates containing fewer non-sensitive itemsets are modified first. To reduce distortion of the data, we derive the minimum number of transactions that must be modified to conceal a sensitive rule. Comparative experiments on real datasets showed that the new method achieves satisfactory results with fewer side effects and less data loss.
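The greedy selection described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm, whose details are not given here. It assumes that a rule X → Y is hidden by deleting a single "victim" item of the consequent Y from supporting transactions, so that the count of X stays fixed while the count of X ∪ Y drops by one per modified transaction. The function names, the `victim` parameter, and the closed-form bound in `min_modifications` are all hypothetical and follow only from that assumption.

```python
from fractions import Fraction
from math import floor


def min_modifications(n_xy, n_x, n_total, mst, mct):
    """Fewest supporting transactions to alter so that X -> Y is hidden.

    Assumes each alteration lowers count(X u Y) by one and leaves count(X)
    unchanged. The rule counts as hidden once support < mst OR
    confidence < mct. Pass thresholds as Fractions to avoid float rounding.
    """
    by_support = floor(n_xy - mst * n_total) + 1    # (n_xy - k) / n_total < mst
    by_confidence = floor(n_xy - mct * n_x) + 1     # (n_xy - k) / n_x < mct
    return max(0, min(by_support, by_confidence))


def hide_rule(db, antecedent, consequent, victim, non_sensitive, mst, mct):
    """Return a sanitized copy of db and the indices of modified transactions."""
    if victim not in consequent or antecedent & consequent:
        raise ValueError("victim must be in the consequent, disjoint from X")
    rule_items = antecedent | consequent
    supporting = [i for i, t in enumerate(db) if rule_items <= t]
    n_x = sum(1 for t in db if antecedent <= t)
    k = min_modifications(len(supporting), n_x, len(db), mst, mct)
    # Greedy order: transactions holding fewer non-sensitive itemsets first.
    supporting.sort(key=lambda i: sum(1 for s in non_sensitive if s <= db[i]))
    sanitized = [set(t) for t in db]
    for i in supporting[:k]:
        sanitized[i].discard(victim)
    return sanitized, supporting[:k]


# Toy database; the sensitive rule is {a} -> {b}.
db = [{"a", "b", "c"}, {"a", "b"}, {"a", "b", "d"},
      {"a", "c"}, {"b", "c"}, {"a", "b", "c", "d"}]
non_sensitive = [{"a", "c"}, {"b", "c"}, {"a", "d"}]
sanitized, changed = hide_rule(db, {"a"}, {"b"}, "b", non_sensitive,
                               mst=Fraction(1, 2), mct=Fraction(3, 4))
```

In this toy run the confidence threshold is the cheaper one to violate, so only one transaction is changed, and the one chosen contains none of the listed non-sensitive itemsets.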

Notes

  1. It should be noted that support and confidence are just two of the many interestingness measures that could be employed; Shillabeer and Roddick [28] identify over 45 such measures.

References

  1. Agrawal R, Imieliński T, Swami A (1993) Mining association rules between sets of items in large databases. ACM SIGMOD Record 22(2):207–216

  2. Agrawal R, Srikant R, et al. (1994) Fast algorithms for mining association rules. In: Proc. 20th int. conf. very large data bases, VLDB, vol 1215, pp 487–499

  3. Amiri A (2007) Dare to share: Protecting sensitive knowledge with data sanitization. Decis Support Syst 43(1):181–191

  4. Gkoulalas-Divanis A, Loukides G (2011) Revisiting sequential pattern hiding to enhance utility. In: Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining, pp 1316–1324

  5. Atallah M, Bertino E, Elmagarmid A, Ibrahim M, Verykios V (1999) Disclosure limitation of sensitive rules. In: Proceedings of the 1999 Workshop on Knowledge and Data Engineering Exchange (KDEX’99), pp 45–52. IEEE

  6. Atzori M, Bonchi F, Giannotti F, Pedreschi D (2008) Anonymity preserving pattern discovery. The VLDB Journal 17(4):703–727

  7. Bayardo RJ Jr. (1998) Efficiently mining long patterns from databases. ACM Sigmod Record 27(2):85–93

  8. Bodon F (2003) A fast apriori implementation. In: Proceedings of the IEEE ICDM workshop on frequent itemset mining implementations (FIMI03), vol 90

  9. Bodon F (2005) A trie-based apriori implementation for mining frequent item sequences. In: Proceedings of the 1st international workshop on open source data mining: frequent pattern mining implementations, pp 56–65. ACM

  10. Bodon F (2006) A survey on frequent itemset mining. Budapest University of Technology and Economics, Tech. Rep.

  11. Cheng P, Chu S-C, Lin C-W, Roddick JF (2014) Distortion-based heuristic sensitive rule hiding method–the greedy way. In: Modern Advances in Applied Intelligence, pp 77–86. Springer

  12. Dasseni E, Verykios VS, Elmagarmid AK, Bertino E (2001) Hiding association rules by using confidence and support. In: Information Hiding, pp 369–383. Springer

  13. Fule P, Roddick JF (2004) Detecting privacy and ethical sensitivity in data mining results. In: Proceedings of the 27th Australasian conference on Computer science-Volume 26, pp 159–166. Australian Computer Society, Inc.

  14. Gkoulalas-Divanis A, Haritsa J, Kantarcioglu M (2014) Privacy issues in association rule mining. In: Aggarwal CC, Han J (eds) Frequent Pattern Mining, chapter 15, pp 369–401. Springer International Publishing

  15. Gkoulalas-Divanis A, Verykios VS (2009) Exact knowledge hiding through database extension. IEEE Trans Knowl Data Eng 21(5):699–713

  16. Loukides G, Gkoulalas-Divanis A (2013) Hiding sensitive patterns from sequence databases: Research challenges and solutions. In: IEEE 14th International Conference on Mobile Data Management (MDM), pp 45–50

  17. Hong T-P, Lin C-W, Yang K-T, Wang S-L (2013) Using tf-idf to hide sensitive itemsets. Appl Intell 38(4):502–510

  18. Islam MZ, Brankovic L (2011) Privacy preserving data mining: A noise addition framework using a novel clustering technique. Knowl-Based Syst 24(8):1214–1223

  19. Kohavi R, Brodley CE, et al. (2000) KDD-Cup 2000 organizers’ report: Peeling the onion. ACM SIGKDD Explorations Newsletter 2(2):86–93

  20. Lin C-W, Hong T-P, Chang C-C, Wang S-L (2013) A greedy-based approach for hiding sensitive itemsets by transaction insertion. J Inf Hiding Multim Signal Process 4(4):201–227

  21. Menon S, Sarkar S (2007) Minimizing information loss and preserving privacy. Manag Sci 53(1):101–116

  22. Menon S, Sarkar S, Mukherjee S (2005) Maximizing accuracy of shared databases when concealing sensitive patterns. Inf Syst Res 16(3):256–270

  23. Oliveira SRM, Zaiane OR (2002) Privacy preserving frequent itemset mining. In: Proceedings of the IEEE international conference on Privacy, security and data mining, vol 14, pp 43–54. Australian Computer Society, Inc.

  24. Pontikakis ED, Theodoridis Y, Tsitsonis AA, Chang L, Verykios VS (2004) A quantitative and qualitative analysis of blocking in association rule hiding. In: Proceedings of the 2004 ACM workshop on Privacy in the electronic society, pp 29–30. ACM

  25. Gwadera R, Gkoulalas-Divanis A, Loukides G (2013) Permutation-based sequential pattern hiding. In: IEEE International Conference on Data Mining (ICDM), pp 241–250

  26. Sathiyapriya K, Sadasivam GS (2013) A survey on privacy preserving association rule mining. Int J Data Min Knowl Manag Process 3(2)

  27. Saygin Y, Verykios VS, Clifton C (2001) Using unknowns to prevent discovery of association rules. ACM SIGMOD Record 30(4):45–54

  28. Shillabeer A, Roddick JF (2006) Towards role based hypothesis evaluation for health data mining. Electron J Health Inform 1(1):e6

  29. Verykios VS (2013) Association rule hiding methods. Wiley Interdiscip Rev Data Min Knowl Disc 3(1):28–36

  30. Verykios VS, Elmagarmid AK, Bertino E, Saygin Y, Dasseni E (2004) Association rule hiding. IEEE Trans Knowl Data Eng 16(4):434–447

  31. Verykios VS, Pontikakis ED, Theodoridis Y, Chang LW (2007) Efficient algorithms for distortion and blocking techniques in association rule hiding. Distributed and Parallel Databases 22(1):85–104

  32. Wu Y-H, Chiang C-M, Chen ALP (2007) Hiding sensitive association rules with limited side effects. IEEE Trans Knowl Data Eng 19(1):29–42

Author information

Corresponding author

Correspondence to Peng Cheng.

About this article

Cite this article

Cheng, P., Roddick, J.F., Chu, SC. et al. Privacy preservation through a greedy, distortion-based rule-hiding method. Appl Intell 44, 295–306 (2016). https://doi.org/10.1007/s10489-015-0671-0

  • DOI: https://doi.org/10.1007/s10489-015-0671-0
