Privacy Problems with Anonymized Transaction Databases

  • Taneli Mielikäinen
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3245)

Abstract

In this paper, we consider privacy problems with anonymized transaction databases, i.e., transaction databases where the items are renamed in order to hide sensitive information. In particular, we show how an anonymized transaction database can be deanonymized using non-anonymized frequent itemsets. We describe how the problem can be formulated as an integer programming task, study the computational complexity of the problem, discuss how the computations could be done more efficiently in practice, and experimentally examine the feasibility of the proposed approach.
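The attack described in the abstract can be illustrated with a toy example. The sketch below is not the paper's integer programming formulation; it swaps in a plain brute-force search over all item renamings, which is feasible only for a handful of items. The function names (`support_counts`, `consistent_renamings`), the item names, and the data are hypothetical and chosen purely for illustration. It assumes the attacker knows the supports of some frequent itemsets over the real item names and sees the database with renamed items.

```python
from itertools import combinations, permutations


def support_counts(transactions, max_size):
    """Count how many transactions contain each itemset of size 1..max_size."""
    counts = {}
    for t in transactions:
        for k in range(1, max_size + 1):
            for itemset in combinations(sorted(t), k):
                key = frozenset(itemset)
                counts[key] = counts.get(key, 0) + 1
    return counts


def consistent_renamings(anonymized, known_supports, real_items):
    """Return every bijection (anonymized name -> real item) under which the
    anonymized database reproduces all known itemset supports.

    Brute force over all permutations: illustrative only, exponential cost.
    Assumes there are exactly as many real items as anonymized names.
    """
    anon_items = sorted({a for t in anonymized for a in t})
    max_size = max(len(s) for s in known_supports)
    anon_counts = support_counts(anonymized, max_size)
    matches = []
    for perm in permutations(sorted(real_items)):
        mapping = dict(zip(anon_items, perm))
        inverse = {real: anon for anon, real in mapping.items()}
        # Translate each known itemset into anonymized names and compare supports.
        if all(
            anon_counts.get(frozenset(inverse[i] for i in itemset), 0) == supp
            for itemset, supp in known_supports.items()
        ):
            matches.append(mapping)
    return matches


# Hypothetical anonymized database: items renamed to "a", "b", "c".
anonymized = [{"b", "c"}, {"b", "c"}, {"a", "b"}, {"b"}, {"c"}]

# Hypothetical public knowledge: frequent itemsets (support >= 2) with real names.
frequent = {
    frozenset({"bread"}): 4,
    frozenset({"milk"}): 3,
    frozenset({"bread", "milk"}): 2,
}

found = consistent_renamings(anonymized, frequent, {"bread", "milk", "beer"})
print(found)
```

In this toy instance the supports of the frequent itemsets pin down the renaming completely, even though "beer" is not itself frequent: it is the only item left once the others are matched. When several renamings remain consistent, the attacker learns only a set of candidates.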

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Taneli Mielikäinen (1)

  1. HIIT Basic Research Unit, Department of Computer Science, University of Helsinki, Finland
