A distributed privacy-preserving regularization network committee machine of isolated Peer classifiers for P2P data mining

Artificial Intelligence Review

Abstract

This work describes a fully asynchronous, scalable, and privacy-preserving committee machine for distributed data mining in peer-to-peer systems. Regularization neural networks are used for both the Peer classifiers and the combiner committee, in an embedded architecture. The proposed method builds the committee machine from the large amounts of training data distributed over the peers, without moving the data and with little centralized coordination. At the end of the training phase, no Peer knows anything beyond its own local data. This privacy-preserving requirement is challenging for trainable combiners but crucial in real-world applications. Only classifiers are transmitted to other peers, which validate them on their own local data and send back average accuracy rates in a classical asynchronous peer-to-peer execution cycle. Here, the validation set for one classifier is the training set of another, and vice versa. From this entirely distributed and privacy-preserving mutual validation, a coarse-grained asymmetric mutual validation matrix can be formed to map all Peer members. We demonstrate that this matrix can be exploited to efficiently train another regularization network as the combiner committee machine.
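
The mutual validation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian kernel, the values of `sigma` and `lam`, the toy two-class data, and every name below (`RegularizationNetwork`, `mutual_validation_matrix`, `make_toy_peers`) are assumptions made for illustration. Each simulated peer fits a regularization network on its own data, and entry `V[i, j]` records the accuracy of peer `i`'s classifier on peer `j`'s data. The simulation runs in a single process and passes whole model objects around, kernel centers included, so it does not model asynchronous messaging or whatever mechanism the paper uses to keep local data private. How the matrix is then used to train the combiner is not specified in the abstract and is left out.

```python
import numpy as np


def gaussian_kernel(A, B, sigma=1.0):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))


class RegularizationNetwork:
    """Kernel expansion f(x) = sum_i c_i K(x, x_i), fitted by solving
    (K + lam * I) c = y. Hyperparameter values are illustrative assumptions."""

    def __init__(self, sigma=1.0, lam=1e-2):
        self.sigma = sigma
        self.lam = lam

    def fit(self, X, y):
        K = gaussian_kernel(X, X, self.sigma)
        self.centers = X
        self.coef = np.linalg.solve(K + self.lam * np.eye(len(X)), y.astype(float))
        return self

    def predict(self, X):
        scores = gaussian_kernel(X, self.centers, self.sigma) @ self.coef
        return np.where(scores >= 0.0, 1, -1)


def mutual_validation_matrix(peer_data):
    """V[i, j] = accuracy of peer i's classifier on peer j's local data.

    The diagonal is left as NaN because a classifier is not validated
    on its own training set in this sketch.
    """
    models = [RegularizationNetwork().fit(X, y) for X, y in peer_data]
    n = len(peer_data)
    V = np.full((n, n), np.nan)
    for i, model in enumerate(models):
        for j, (Xj, yj) in enumerate(peer_data):
            if i != j:
                V[i, j] = np.mean(model.predict(Xj) == yj)
    return V


def make_toy_peers(n_peers=3, n_per_class=20, seed=0):
    """Hypothetical data: each peer holds two well-separated 2-D blobs."""
    rng = np.random.default_rng(seed)
    peers = []
    for _ in range(n_peers):
        pos = rng.normal(loc=2.0, scale=0.5, size=(n_per_class, 2))
        neg = rng.normal(loc=-2.0, scale=0.5, size=(n_per_class, 2))
        X = np.vstack([pos, neg])
        y = np.concatenate([np.ones(n_per_class, dtype=int),
                            -np.ones(n_per_class, dtype=int)])
        peers.append((X, y))
    return peers


peers = make_toy_peers()
V = mutual_validation_matrix(peers)
print(np.round(V, 2))
```

In general `V[i, j]` and `V[j, i]` differ, since they pair a different classifier with a different validation set, which is why the matrix is asymmetric.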

Acknowledgments

The authors would like to thank the anonymous reviewers for their useful suggestions, which helped improve the presentation and clarity of this paper.

Author information

Correspondence to Yiannis Kokkinos.

About this article

Cite this article

Kokkinos, Y., Margaritis, K.G. A distributed privacy-preserving regularization network committee machine of isolated Peer classifiers for P2P data mining. Artif Intell Rev 42, 385–402 (2014). https://doi.org/10.1007/s10462-013-9418-7
