Multi-objective optimization based privacy preserving distributed data mining in Peer-to-Peer networks

Peer-to-Peer Networking and Applications

Abstract

This paper proposes a scalable, local, privacy-preserving algorithm for distributed Peer-to-Peer (P2P) data aggregation, a building block for many advanced data mining and analysis tasks such as average/sum computation, decision tree induction, and feature selection. Unlike most multi-party privacy-preserving data mining algorithms, the approach works asynchronously through local interactions and is highly scalable. In particular, it addresses the distributed computation of the sum of a set of numbers stored at different peers in a P2P network, in the context of a P2P web mining application. The proposed optimization-based privacy-preserving technique for computing the sum lets each peer specify its own privacy requirements instead of adhering to a global set of parameters for the chosen privacy model. Since distributed sum computation is a frequently used primitive, the proposed approach is likely to have significant impact on many data mining tasks such as multi-party privacy-preserving clustering, frequent itemset mining, and statistical aggregate computation.
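To make the sum primitive concrete, the following is a minimal sketch of the textbook ring-based secure-sum protocol. It is not the optimization-based algorithm proposed in the paper, whose details are not given in the abstract. The function name `ring_secure_sum`, the `modulus` parameter, and the sample values are assumptions chosen for illustration, and the ring is simulated in a single process rather than over a real P2P network.

```python
import random


def ring_secure_sum(values, modulus, rng=None):
    """Simulate a ring-based secure sum (illustrative, not the paper's method).

    Assumes every value and the true total lie in [0, modulus).
    """
    rng = rng or random.Random()
    if any(not 0 <= v < modulus for v in values):
        raise ValueError("each value must lie in [0, modulus)")
    if sum(values) >= modulus:
        raise ValueError("modulus must exceed the true sum")

    # The initiator (peer 0) hides its value behind a uniformly random mask.
    mask = rng.randrange(modulus)
    running = (mask + values[0]) % modulus

    # Each remaining peer only ever sees the masked running total,
    # adds its own value, and forwards the result to the next peer.
    for v in values[1:]:
        running = (running + v) % modulus

    # Back at the initiator: removing the mask reveals only the total.
    return (running - mask) % modulus


peer_values = [12, 7, 30, 5]  # hypothetical private values, one per peer
total = ring_secure_sum(peer_values, modulus=1000)
print(total)  # 54
```

In this simplified protocol every peer shares one ring and one modulus, which is exactly the kind of global parameter setting the proposed technique aims to avoid by letting peers state individual privacy requirements.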

Author information

Corresponding author

Correspondence to Kamalika Das.

Additional information

A shorter version of this paper was published at the IEEE P2P’09 conference. This work was supported by AFOSR MURI grant 2009-11.

About this article

Cite this article

Das, K., Bhaduri, K. & Kargupta, H. Multi-objective optimization based privacy preserving distributed data mining in Peer-to-Peer networks. Peer-to-Peer Netw. Appl. 4, 192–209 (2011). https://doi.org/10.1007/s12083-010-0075-1

  • DOI: https://doi.org/10.1007/s12083-010-0075-1