
Multi-objective optimization based privacy preserving distributed data mining in Peer-to-Peer networks

Peer-to-Peer Networking and Applications

Abstract

This paper proposes a scalable, local privacy-preserving algorithm for distributed Peer-to-Peer (P2P) data aggregation useful for many advanced data mining/analysis tasks such as average/sum computation, decision tree induction, feature selection, and more. Unlike most multi-party privacy-preserving data mining algorithms, this approach works in an asynchronous manner through local interactions and it is highly scalable. It particularly deals with the distributed computation of the sum of a set of numbers stored at different peers in a P2P network in the context of a P2P web mining application. The proposed optimization-based privacy-preserving technique for computing the sum allows different peers to specify different privacy requirements without having to adhere to a global set of parameters for the chosen privacy model. Since distributed sum computation is a frequently used primitive, the proposed approach is likely to have significant impact on many data mining tasks such as multi-party privacy-preserving clustering, frequent itemset mining, and statistical aggregate computation.
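The abstract names distributed sum computation as the core primitive but does not spell out the steps, so the sketch below is not the paper's optimization-based algorithm. It is a minimal simulation of the classic ring-based secure sum, included only to make the primitive concrete. The function name `ring_secure_sum`, the `modulus` parameter, the ring ordering, and the assumption of non-colluding, honest-but-curious peers holding non-negative integers are all illustrative choices, not taken from the paper.

```python
import random


def ring_secure_sum(values, modulus, rng=None):
    """Simulate one pass of a ring-based secure sum.

    Hypothetical helper for illustration only; not the paper's method.
    Assumes non-negative integer inputs whose true sum is below `modulus`.
    """
    if rng is None:
        rng = random.Random()
    if any(v < 0 for v in values):
        raise ValueError("values must be non-negative")
    # Simulation-only sanity check: real peers could not evaluate this jointly.
    if sum(values) >= modulus:
        raise ValueError("modulus must exceed the true sum")

    # The initiating peer draws a secret mask uniformly from [0, modulus).
    mask = rng.randrange(modulus)

    # The masked running total travels around the ring. Each peer adds its
    # own value modulo `modulus`, so it only ever sees a masked number.
    running = mask
    for v in values:
        running = (running + v) % modulus

    # Back at the initiator, removing the mask reveals only the total.
    return (running - mask) % modulus


if __name__ == "__main__":
    peer_values = [12, 7, 30, 1]  # one private number per simulated peer
    print(ring_secure_sum(peer_values, modulus=1_000))  # 50
```

Note the contrast with the approach the abstract describes: this baseline needs a fixed ring order, runs synchronously, and applies one global masking parameter to every peer, whereas the proposed technique works asynchronously through local interactions and lets each peer specify its own privacy requirement.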



Author information

Corresponding author

Correspondence to Kamalika Das.

Additional information

A shorter version of this paper was published at the IEEE P2P’09 conference. This work was supported by AFOSR MURI grant 2009-11.


About this article

Cite this article

Das, K., Bhaduri, K. & Kargupta, H. Multi-objective optimization based privacy preserving distributed data mining in Peer-to-Peer networks. Peer-to-Peer Netw. Appl. 4, 192–209 (2011). https://doi.org/10.1007/s12083-010-0075-1


  • DOI: https://doi.org/10.1007/s12083-010-0075-1
