Abstract
Distributed training allows individual entities to benefit from the training sets owned by other entities. However, it also raises serious privacy concerns, so protecting privacy in distributed training is highly important. In this paper, we study privacy protection in the distributed training of neural network ensembles. We design a privacy-preserving distributed algorithm for training neural network ensembles using AdaBoost.M2. We also analyze the security and complexity of our algorithm. Furthermore, we perform experiments on two data sets from the UCI repository to verify the algorithm’s effectiveness and efficiency.
Notes
For ease of description, we write $h_t(x_i, y)$ as $h_t(i, y)$ throughout the paper.
In an asymmetric cryptographic scheme, keys are generated in pairs (K+, K−). The published public key K+ is used for encryption, and the private key K− is used for decryption.
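As an illustration of such a key pair, the following is a minimal sketch of textbook ElGamal encryption in Python. The parameters `P`, `Q`, and `G` and the function names are hypothetical choices made for this example; the tiny modulus is insecure and is not taken from the paper.

```python
import secrets

# Toy parameters, assumed for illustration only (far too small for real use).
P = 467  # safe prime: 467 = 2 * 233 + 1
Q = 233  # prime order of the subgroup
G = 4    # 4 = 2^2 is a quadratic residue, so it generates the order-Q subgroup


def keygen():
    """Return (k_pub, k_priv): K+ is published, K- is kept secret."""
    k_priv = secrets.randbelow(Q - 1) + 1  # K- drawn from [1, Q-1]
    k_pub = pow(G, k_priv, P)              # K+ = G^(K-) mod P
    return k_pub, k_priv


def encrypt(k_pub, m):
    """Encrypt a nonzero residue m using only the public key K+."""
    r = secrets.randbelow(Q - 1) + 1       # fresh randomness per ciphertext
    return pow(G, r, P), (m * pow(k_pub, r, P)) % P


def decrypt(k_priv, ciphertext):
    """Recover m using the private key K-."""
    c1, c2 = ciphertext
    shared = pow(c1, k_priv, P)            # equals K+^r mod P
    # Invert the shared value with Fermat's little theorem (P is prime).
    return (c2 * pow(shared, P - 2, P)) % P


k_pub, k_priv = keygen()
message = pow(G, 5, P)                     # a message encoded in the subgroup
assert decrypt(k_priv, encrypt(k_pub, message)) == message
```

Anyone holding K+ can produce ciphertexts, but only the holder of K− can recover the message.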
Note that this scaling operation can be easily implemented in the privacy-preserving training algorithm proposed in [6], since the authors use linear functions to estimate the sigmoid function.
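To make the idea of a linear estimate of the sigmoid concrete, here is a minimal Python sketch. The three-piece form below (the tangent line at 0, clipped to [0, 1]) is an assumption chosen for illustration and is not necessarily the approximation used in [6].

```python
import math


def sigmoid(x):
    """Exact logistic sigmoid, used here only as a reference."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_linear(x):
    """Hypothetical piecewise-linear stand-in: 0.25 * x + 0.5 clipped to [0, 1]."""
    return min(1.0, max(0.0, 0.25 * x + 0.5))


# Compare the two functions at a few points.
for x in (-4.0, -1.0, 0.0, 1.0, 4.0):
    print(x, round(sigmoid(x), 4), sigmoid_linear(x))
```

On each piece the estimate needs only a multiplication by a constant and an addition, so rescaling its input or output amounts to changing those constants.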
For the sake of clarity and accuracy, we omit the bagging line and add test cases of ensembles with 4, 6, and 8 component networks.
Here we assume that the exhaustive search is performed using the same search order (i.e., increasing order and decreasing order) every time.
References
Flouri K, Beferull-Lozano B, Tsakalides P (2006) Training a SVM-based classifier in distributed sensor networks. In: European signal processing conference, Florence, Italy, Sep
Stolfo SJ, Prodromidis AL, Tselepis S, Lee W, Fan DW, Chan PK (1997) JAM: Java agent for meta-learning over distributed databases. In: Proceedings of ACM SIGKDD international conference on knowledge discovery data mining, pp 74–81
Navia-Vázquez A, Gutiérrez-González D, Parrado-Hernández E, Navarro-Abellán JJ (2006) Distributed support vector machines. IEEE Trans Neural Netw 17(4):1091–1097
Samet S, Miri A (2008) Privacy-preserving protocols for perceptron learning algorithm in neural networks. In: 2008 4th international IEEE conference on intelligent system
Secretan J, Georgiopoulos M, Castro J (2007) A privacy preserving probabilistic neural network for horizontally partitioned databases. In: Proceedings of international joint conference on neural networks, Orlando, FL, USA
Chen T, Zhong S (2009) Privacy-preserving backpropagation neural network learning. IEEE Trans Neural Netw 20(10):1554–1564
Goh WY, Lim CP, Peh KK (2003) Predicting drug dissolution profiles with an ensemble of boosted neural networks: a time series approach. IEEE Trans Neural Netw 14(2):459–463
Sharkey AJ (1999) Combining artificial neural nets: ensemble and modular multi-net systems. Springer-Verlag New York, Inc., Secaucus, NJ
Sollich P, Krogh A (1996) Learning with ensembles: how over-fitting can be useful. In: Touretzky DS, Mozer MC, Hasselmo ME (eds) Advances in neural information processing systems 8, Denver, CO, MIT Press, Cambridge, MA, pp 190–196
Hansen LK, Liisberg L, Salamon P (1992) Ensemble methods for handwritten digit recognition. In: Proceedings of IEEE workshop on neural networks for signal processing, Helsingoer, Denmark, IEEE Press, Piscataway, NJ, pp 333–342
West D, Dellana S, Qian J (2005) Neural network ensemble strategies for financial decision applications. Comput Oper Res 32:2543–2559
Tsai CF, Wu JW (2008) Using neural network ensembles for bankruptcy prediction and credit scoring. Expert Syst Appl 34:2639–2649
Zhou ZH, Jiang Y, Yang YB, Chen SF (2002) Lung cancer cell identification based on artificial neural network ensembles. Artif Intell Med 24(1):25–36
Das R, Turkoglu I, Sengur A (2009) Diagnosis of valvular heart disease through neural networks ensembles. Comput Methods Programs Biomed 93(2):185–191
Breiman L (1996) Bagging predictors. Mach Learn 24(2):123–140
Schapire RE (1990) The strength of weak learnability. Mach Learn 5(2):197–227
Freund Y, Schapire RE (1997) A decision theoretic generalization of online learning and an application to boosting. J Comput Syst Sci 55(1):119–139
Freund Y, Schapire RE (1996) Experiments with a new boosting algorithm. In: Machine learning: proceedings of the thirteenth international conference, Morgan Kaufmann, San Francisco, pp 148–156
Goldreich O (2001) Foundations of cryptography, vols 1 and 2. Cambridge University Press, Cambridge
Goethals B, Laur S, Lipmaa H, Mielikäinen T (2004) On private scalar product computation for privacy-preserving data mining. In: Park C, Chee S (eds) Information security and cryptology—ICISC 2004, vol 3506 of lecture notes in computer science. Springer, Berlin, pp 104–120
Zhong S (2007) Privacy-preserving algorithms for distributed mining of frequent itemsets. Inf Sci 177(2):490–503
Paillier P (1999) Public-key cryptosystems based on composite degree residuosity classes. In: EUROCRYPT, pp 223–238
ElGamal T (1985) A public-key cryptosystem and a signature scheme based on discrete logarithms. IEEE Trans Inf Theory IT-31(4):469–472
Boneh D (1998) The decision Diffie-Hellman problem. In: Proceedings of 3rd algorithmic number theory symposium, pp 48–63
Barni M, Orlandi C, Piva A (2006) A privacy-preserving protocol for neural-network-based computation. In: Proceedings of 8th workshop multimedia security, New York, pp 146–151
Orlandi C, Piva A, Barni M (2007) Oblivious neural network computing via homomorphic encryption. EURASIP J Inf Secur 2007:18:1–18:10
Wan L, Ng WK, Lee VCS (2007) Privacy-preservation for gradient descent methods. In: Proceedings of ACM SIGKDD international conference on knowledge discovery data mining, pp 775–783
Abramowitz M, Stegun IA (1970) Handbook of mathematical functions with formulas, graphs, and mathematical tables. Dover Publications, New York (Ninth printing)
Blake CL, Merz CJ (1998) UCI repository of machine learning database. University of California, Department of Information and Computer Science, Irvine, CA [Online]. Available: http://www.ics.uci.edu/mlearn/MLRepository.htm
Dai W (2010) The Crypto++ library 5.6.0. http://www.cryptopp.co
Lazarevic A, Obradovic Z (2001) The distributed boosting algorithm. In: Proceedings of 7th international conference on knowledge discovery data mining, pp 311–316
Lazarevic A, Obradovic Z (2002) Boosting algorithms for parallel and distributed learning. Distrib Parallel Databases 11(2):203–229
Fan W, Stolfo S, Zhang J (1999) The application of AdaBoost for distributed, scalable and on-line learning. In: Proceedings of 5th international conference on knowledge discovery data mining, pp 362–366
Gambs S, Kégl B, Aïmeur E (2007) Privacy-preserving boosting. Data Min Knowl Disc 14:131–170
Agrawal D, Srikant R (2000) Privacy-preserving data mining. In: Proceedings of ACM SIGMOD, pp 439–450
Lindell Y, Pinkas B (2000) Privacy preserving data mining. In: Lecture notes in computer science, vol 1880. Springer, Berlin, pp 36–44
Du W, Zhan Z (2003) Using randomized response techniques for privacy-preserving data mining. In: KDD
Chen K, Liu L (2005) Privacy preserving data classification with rotation perturbation. In: Proceedings of international conference on data mining, pp 589–592
Yang Z, Zhong S, Wright R (2005) Privacy-preserving classification of customer data without loss of accuracy. In: 2005 SIAM international conference on data mining (SDM2005)
Wright R, Yang Z (2004) Privacy-preserving Bayesian network structure computation on distributed heterogeneous data. In: Proceedings of 10th ACM SIGKDD international conference on knowledge discovery data mining, pp 713–718
About this article
Cite this article
Zhang, Y., Zhong, S. A privacy-preserving algorithm for distributed training of neural network ensembles. Neural Comput & Applic 22 (Suppl 1), 269–282 (2013). https://doi.org/10.1007/s00521-012-1000-8
DOI: https://doi.org/10.1007/s00521-012-1000-8