Distributed Data Mining Protocols for Privacy: A Review of Some Recent Results

Conference paper in: Secure Mobile Ad-hoc Networks and Sensors (MADNES 2005)

Part of the book series: Lecture Notes in Computer Science (LNCCN, volume 4074)

Abstract

With the rapid advance of the Internet, a large amount of sensitive data is collected, stored, and processed by different parties. Data mining is a powerful tool that can extract knowledge from large amounts of data. Generally, data mining requires that data be collected into a central site. However, privacy concerns may prevent different parties from sharing their data with others. Cryptography provides extremely powerful tools which enable data sharing while protecting data privacy.

In this paper, we briefly survey four recently proposed cryptographic techniques for protecting data privacy in distributed settings. First, we describe a privacy-preserving technique for learning Bayesian networks from a dataset vertically partitioned between two parties. Then, we describe three privacy-preserving data mining techniques in a fully distributed setting where each customer holds a single data record of the database.
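To make the fully distributed setting concrete, the following is a minimal sketch of a secure sum built on additive secret sharing. It is not one of the four protocols surveyed in the paper; it is a generic textbook construction, shown only to illustrate how an aggregate can be computed while each customer keeps its single record private. The names `MODULUS`, `share`, and `secure_sum` are hypothetical, and the sketch assumes honest-but-curious participants and simulates all parties in one process.

```python
import secrets

# Public modulus (an assumption for this sketch); all arithmetic is done mod this value.
MODULUS = 2**61 - 1


def share(value, n_parties):
    """Split value into n_parties additive shares that sum to value mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def secure_sum(private_values):
    """Simulate n customers computing the sum of their private values.

    Each customer splits its value into shares, one per customer. Every
    customer then publishes only the sum of the shares it received, so no
    individual value is revealed by the published partial sums.
    """
    n = len(private_values)
    # all_shares[i][j] is the share that customer i sends to customer j.
    all_shares = [share(v, n) for v in private_values]
    # Customer j adds up the shares it received and publishes that partial sum.
    partial_sums = [
        sum(all_shares[i][j] for i in range(n)) % MODULUS for j in range(n)
    ]
    return sum(partial_sums) % MODULUS


# Fully distributed setting: each customer holds one record,
# reduced here to a single 0/1 attribute.
customer_bits = [1, 0, 1, 1, 0]
total = secure_sum(customer_bits)  # number of customers whose attribute is 1
```

Each individual share is uniformly random on its own, so a single share reveals nothing about the value it came from; only the final total is meaningful. Real protocols for this setting must also handle message transport, collusion, and dropped participants, which this sketch ignores.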

This work was supported by the National Science Foundation under Grant No. CCR-0331584.

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wright, R.N., Yang, Z., Zhong, S. (2006). Distributed Data Mining Protocols for Privacy: A Review of Some Recent Results. In: Burmester, M., Yasinsac, A. (eds) Secure Mobile Ad-hoc Networks and Sensors. MADNES 2005. Lecture Notes in Computer Science, vol 4074. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11801412_7

  • DOI: https://doi.org/10.1007/11801412_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-36646-1

  • Online ISBN: 978-3-540-37863-1

  • eBook Packages: Computer Science, Computer Science (R0)
