IFIP Annual Conference on Data and Applications Security and Privacy

DBSec 2012: Data and Applications Security and Privacy XXVI, pp 129–144

Approximate Privacy-Preserving Data Mining on Vertically Partitioned Data

  • Robert Nix,
  • Murat Kantarcioglu &
  • Keesook J. Han
  • Conference paper

Part of the Lecture Notes in Computer Science book series (LNISA, volume 7371)

Abstract

In today’s increasingly digital world, data privacy has become more and more important. Researchers have developed many privacy-preserving technologies, particularly in the areas of data mining and data sharing. These technologies can compute exact data mining models from private data without revealing that data, but they are generally slow. We therefore present a framework for implementing efficient, privacy-preserving, secure approximations of data mining tasks. In particular, we implement two sketching protocols for the scalar (dot) product of two vectors, which can be used as sub-protocols in larger data mining tasks. These protocols can lead to approximations with high accuracy, low data leakage, and an improvement in efficiency of one to two orders of magnitude. We demonstrate these accuracy and efficiency results through extensive experimentation. We also analyze the security properties of these approximations under a security definition which, in contrast to previous definitions, allows for very efficient approximation protocols.
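
To make the idea of sketching a scalar product concrete, the following is a minimal illustration of one well-known technique, random projection with ±1 entries. It is not taken from the paper and is not either of the authors' protocols: it includes no encryption or other privacy protection, and the helper names (`make_projection`, `sketch`, `approx_dot`) and the vector sizes are assumptions chosen for illustration. It shows only how two length-n vectors can each be compressed to k numbers whose dot product is an unbiased estimate of the original one.

```python
import random


def make_projection(k, n, seed=0):
    """Hypothetical helper: a k x n matrix of independent +/-1 entries."""
    rng = random.Random(seed)
    return [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(k)]


def sketch(R, x):
    """Compress x (length n) to k numbers: one inner product per row of R."""
    return [sum(r * xi for r, xi in zip(row, x)) for row in R]


def approx_dot(su, sv):
    """Estimate u.v from two sketches built with the same matrix R.

    For +/-1 entries, E[(r.u)(r.v)] = u.v for each row r, so averaging
    over the k rows gives an unbiased estimate.
    """
    return sum(a * b for a, b in zip(su, sv)) / len(su)


if __name__ == "__main__":
    n, k = 1000, 100  # assumed sizes: original and sketch dimensions
    # Two binary columns, e.g. item indicators held by different parties.
    u = [1 if i < 500 else 0 for i in range(n)]
    v = [1 if 250 <= i < 750 else 0 for i in range(n)]

    R = make_projection(k, n, seed=42)  # both sides must use the same R
    exact = sum(a * b for a, b in zip(u, v))
    estimate = approx_dot(sketch(R, u), sketch(R, v))
    print("exact:", exact, "estimate:", estimate)
```

In this toy setting the cost of the final product drops from n to k multiplications, and the estimate's variance shrinks as k grows. In a real privacy-preserving protocol the sketches would be combined under a secure scalar product sub-protocol rather than in the clear.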

Keywords

  • Data Mining
  • Association Rule Mining
  • Random Projection
  • Privacy Preserve
  • Data Mining Task

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Author information

Authors and Affiliations

  1. Jonsson School of Engineering and Computer Science, The University of Texas at Dallas, 800 West Campbell Road, Richardson, Texas, USA

    Robert Nix & Murat Kantarcioglu

  2. Air Force Research Laboratory, Information Directorate, 525 Brooks Road, Rome, New York, USA

    Keesook J. Han

Editor information

Editors and Affiliations

  1. Télécom Bretagne, Campus de Rennes 2, rue de la Châtaigneraie, 35512, Cesson Sévigné Cedex, France

    Nora Cuppens-Boulahia, Frédéric Cuppens & Joaquin Garcia-Alfaro

Copyright information

© 2012 IFIP International Federation for Information Processing

About this paper

Cite this paper

Nix, R., Kantarcioglu, M., Han, K.J. (2012). Approximate Privacy-Preserving Data Mining on Vertically Partitioned Data. In: Cuppens-Boulahia, N., Cuppens, F., Garcia-Alfaro, J. (eds) Data and Applications Security and Privacy XXVI. DBSec 2012. Lecture Notes in Computer Science, vol 7371. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31540-4_11

  • DOI: https://doi.org/10.1007/978-3-642-31540-4_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-31539-8

  • Online ISBN: 978-3-642-31540-4

  • eBook Packages: Computer Science, Computer Science (R0)
