Abstract
Many privacy-preserving data mining algorithms attempt to selectively hide what database owners consider sensitive. Specifically, in the association-rules domain, many of these algorithms are based on item-restriction methods; that is, they remove items from some transactions in order to hide sensitive frequent itemsets.
Because this area is still in its infancy, it has produced few clear methods, and the few that are available have not been evaluated. Determining which approach protects sensitive itemsets most effectively, without hiding non-sensitive ones as a side effect, therefore remains a crucial research issue. This paper introduces two new techniques that deal with scenarios where many itemsets of different sizes are sensitive. We empirically evaluate our two sanitization techniques, comparing their efficiency and determining which has the smaller effect on the non-sensitive frequent itemsets.
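To make the idea of item restriction concrete, the following is a minimal generic sketch and not either of the two techniques introduced in the paper, whose details the abstract does not give. It assumes an absolute support threshold, a single sensitive itemset, and an arbitrary choice of which item to remove; the names `support`, `hide_itemset`, and `min_support` are hypothetical.

```python
from typing import FrozenSet, List

Transaction = FrozenSet[str]


def support(db: List[Transaction], itemset: FrozenSet[str]) -> int:
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in db if itemset <= t)


def hide_itemset(
    db: List[Transaction], sensitive: FrozenSet[str], min_support: int
) -> List[Transaction]:
    """Return a sanitized copy of `db` in which `sensitive` is no longer
    frequent, i.e. its support is below `min_support` (an absolute count).

    Assumption: the item to remove (the "victim") is simply the smallest
    item of the sensitive itemset. A real method would choose the victim
    and the transactions so as to limit side effects on other itemsets.
    """
    sanitized = list(db)
    victim = min(sensitive)
    # Number of supporting transactions that must lose the victim item.
    excess = support(sanitized, sensitive) - (min_support - 1)
    for i, t in enumerate(sanitized):
        if excess <= 0:
            break
        if sensitive <= t:
            sanitized[i] = t - {victim}
            excess -= 1
    return sanitized


db = [
    frozenset("abc"),
    frozenset("ab"),
    frozenset("abd"),
    frozenset("bc"),
    frozenset("a"),
]
sensitive = frozenset("ab")
sanitized = hide_itemset(db, sensitive, min_support=2)

print(support(db, sensitive), support(sanitized, sensitive))
# Side effect on a non-sensitive itemset: support of {a} before and after.
print(support(db, frozenset("a")), support(sanitized, frozenset("a")))
```

In this toy database the sensitive itemset {a, b} drops below the threshold, but the support of the non-sensitive itemset {a} also falls. That kind of side effect is what the abstract says a sanitization technique should minimise.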
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
HajYasien, A., Estivill-Castro, V. (2006). Two New Techniques for Hiding Sensitive Itemsets and Their Empirical Evaluation. In: Tjoa, A.M., Trujillo, J. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2006. Lecture Notes in Computer Science, vol 4081. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11823728_29
DOI: https://doi.org/10.1007/11823728_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-37736-8
Online ISBN: 978-3-540-37737-5
eBook Packages: Computer Science, Computer Science (R0)