Multi-objective optimization based privacy preserving distributed data mining in Peer-to-Peer networks

Das, Kamalika; Bhaduri, Kanishka; Kargupta, Hillol

doi:10.1007/s12083-010-0075-1

Published: 22 June 2010

Peer-to-Peer Networking and Applications, Volume 4, pages 192–209 (2011)

Kamalika Das¹,
Kanishka Bhaduri² &
Hillol Kargupta³,⁴

475 Accesses
4 Citations

Abstract

This paper proposes a scalable, local, privacy-preserving algorithm for distributed Peer-to-Peer (P2P) data aggregation, which is useful for many advanced data mining and analysis tasks such as average/sum computation, decision tree induction, and feature selection. Unlike most multi-party privacy-preserving data mining algorithms, this approach works asynchronously through local interactions and is highly scalable. In particular, it addresses the distributed computation of the sum of a set of numbers stored at different peers in a P2P network, in the context of a P2P web mining application. The proposed optimization-based privacy-preserving technique for computing the sum allows different peers to specify different privacy requirements without having to adhere to a global set of parameters for the chosen privacy model. Since distributed sum computation is a frequently used primitive, the proposed approach is likely to have significant impact on many data mining tasks such as multi-party privacy-preserving clustering, frequent itemset mining, and statistical aggregate computation.
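To make the underlying primitive concrete, the following is a minimal sketch of the well-known ring-based secure-sum idea from the privacy-preserving data mining literature, simulated in a single process. It is not the algorithm proposed in the paper: it uses one global ring and one public modulus, whereas the abstract describes peers choosing their own privacy requirements. The function name `secure_ring_sum`, the `modulus` parameter, and the returned message list are assumptions made for illustration only.

```python
import random


def secure_ring_sum(values, modulus, rng=None):
    """Simulate a ring-based secure sum (illustrative sketch only).

    values  : one non-negative integer per peer, in ring order;
              values[0] belongs to the initiating peer.
    modulus : public bound assumed to be strictly larger than the true sum.
    rng     : optional random.Random instance (hypothetical parameter,
              used here so the simulation can be seeded).

    Returns (total, messages), where messages[i] is what peer i forwards
    to its successor on the ring.
    """
    rng = rng or random.Random()
    if not values:
        raise ValueError("at least one peer is required")
    # Simulation-only sanity check: a real peer cannot see the others' values.
    if any(v < 0 for v in values) or sum(values) >= modulus:
        raise ValueError("values must be non-negative and sum below modulus")

    # The initiator hides its value behind a uniformly random mask.
    mask = rng.randrange(modulus)
    running = (mask + values[0]) % modulus
    messages = [running]

    # Every other peer adds its own value to the masked running total.
    for v in values[1:]:
        running = (running + v) % modulus
        messages.append(running)

    # The last message returns to the initiator, which removes the mask.
    total = (running - mask) % modulus
    return total, messages


if __name__ == "__main__":
    total, messages = secure_ring_sum([3, 5, 7], modulus=100)
    print("sum:", total)
    print("messages seen on the ring:", messages)
```

Because the mask is uniform over the modulus, each forwarded message on its own is uniformly distributed and reveals nothing about the values added so far. A known limitation of this basic scheme is that the predecessor and successor of a peer can collude and subtract their messages to recover that peer's value.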

Author information

Authors and Affiliations

1. Stinger Ghaffarian Technologies Inc., NASA Ames Research Center, MS 269-3, Moffett Field, CA, 94035, USA (Kamalika Das)
2. Mission Critical Technologies Inc., NASA Ames Research Center, MS 269-2, Moffett Field, CA, 94035, USA (Kanishka Bhaduri)
3. CSEE Dept., University of Maryland, Baltimore County, MD, 21250, USA (Hillol Kargupta)
4. AGNIK LLC, Columbia, MD, 21045, USA (Hillol Kargupta)

Corresponding author

Correspondence to Kamalika Das.

Additional information

A shorter version of this paper was published at the IEEE P2P’09 conference. This work was supported by AFOSR MURI grant 2009-11.

About this article

Cite this article

Das, K., Bhaduri, K. & Kargupta, H. Multi-objective optimization based privacy preserving distributed data mining in Peer-to-Peer networks. Peer-to-Peer Netw. Appl. 4, 192–209 (2011). https://doi.org/10.1007/s12083-010-0075-1

Received: 31 December 2009
Accepted: 03 June 2010
Issue Date: June 2011