Two New Techniques for Hiding Sensitive Itemsets and Their Empirical Evaluation

HajYasien, Ahmed; Estivill-Castro, Vladimir

doi:10.1007/11823728_29

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4081)

Included in the following conference series:

International Conference on Data Warehousing and Knowledge Discovery

Abstract

Many privacy-preserving data mining algorithms attempt to selectively hide what database owners consider sensitive. Specifically, in the association-rules domain, many of these algorithms are based on item-restriction methods; that is, they remove items from some transactions in order to hide sensitive frequent itemsets.

Because this area is still in its infancy, it has neither produced clear methods nor evaluated the few that are available. However, determining what is most effective in protecting sensitive itemsets while not hiding non-sensitive ones as a side effect remains a crucial research issue. This paper introduces two new techniques that deal with scenarios where many itemsets of different sizes are sensitive. We empirically evaluate our two sanitization techniques, comparing their efficiency and determining which has the least effect on the non-sensitive frequent itemsets.
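To make the item-restriction idea in the abstract concrete, the following is a minimal sketch in Python. It is not either of the two techniques introduced in the paper, which are not described on this page. It only assumes the generic setting: a database is a list of transactions (sets of items), an itemset is frequent when its support count reaches a threshold, and hiding means removing items from supporting transactions until the support falls below that threshold. The names `support`, `hide_itemset`, `min_support`, and the toy database are hypothetical, and the choice of which item to remove is deliberately naive.

```python
from typing import FrozenSet, List, Set


def support(transactions: List[Set[str]], itemset: FrozenSet[str]) -> int:
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def hide_itemset(
    transactions: List[Set[str]],
    sensitive: FrozenSet[str],
    min_support: int,
) -> List[Set[str]]:
    """Return a sanitized copy in which `sensitive` is no longer frequent.

    Naive item restriction (an illustrative assumption, not the paper's
    method): remove one fixed "victim" item from supporting transactions,
    one transaction at a time, until the support count of `sensitive`
    drops below `min_support`.
    """
    sanitized = [set(t) for t in transactions]  # leave the input untouched
    victim = sorted(sensitive)[0]  # arbitrary but deterministic choice
    for t in sanitized:
        if support(sanitized, sensitive) < min_support:
            break  # already hidden; stop modifying transactions
        if sensitive <= t:
            t.discard(victim)
    return sanitized


# Toy database: {a, b} appears in three of five transactions.
db = [{"a", "b", "c"}, {"a", "b"}, {"a", "b", "d"}, {"b", "c"}, {"a", "c"}]
sensitive = frozenset({"a", "b"})
clean = hide_itemset(db, sensitive, min_support=2)

print(support(db, sensitive), support(clean, sensitive))
print(support(db, frozenset({"a"})), support(clean, frozenset({"a"})))
```

In this toy run the support of the single item `a` also drops, which is the kind of side effect on non-sensitive itemsets that the abstract identifies as the central trade-off when comparing sanitization techniques.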

Author information

Authors and Affiliations

Faculty of Engineering and Information Technology, Griffith University,
Ahmed HajYasien & Vladimir Estivill-Castro

Editor information

Editors and Affiliations

Institute of Software Technology and Interactive Systems, Vienna University of Technology, Favoritenstr. 9-11/188, A-1040, Wien, Austria
A Min Tjoa
Department of Software and Computing Systems, University of Alicante, Spain
Juan Trujillo

About this paper

Cite this paper

HajYasien, A., Estivill-Castro, V. (2006). Two New Techniques for Hiding Sensitive Itemsets and Their Empirical Evaluation. In: Tjoa, A.M., Trujillo, J. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2006. Lecture Notes in Computer Science, vol 4081. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11823728_29

DOI: https://doi.org/10.1007/11823728_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-37736-8
Online ISBN: 978-3-540-37737-5
eBook Packages: Computer Science, Computer Science (R0)
