Abstract
Data mining has been a popular research area for more than a decade because it can efficiently extract statistics and trends from large data sets. In many applications, however, the data set is distributed among different parties, which makes privacy a concern for each individual or organization. This paper addresses the privacy-preserving clustering problem for vertically partitioned data sets (VPD). We propose a secure hierarchical clustering algorithm for two parties over a vertically partitioned data set, together with an accuracy measure. Each site learns only the final clustering result and nothing about any individual's data.
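To make the vertically partitioned setting concrete, the following is a minimal sketch and not the protocol proposed in the paper. It assumes two parties hold different attributes of the same records, uses squared Euclidean distance (which splits into one term per party), and uses single-linkage merging as an arbitrary choice. All function names and the toy data are hypothetical. The distance matrices are added in the clear here, which would leak information; a secure protocol would replace that step with a privacy-preserving computation.

```python
from itertools import combinations

def local_sq_dist(rows):
    """Pairwise squared Euclidean distances over one party's own attributes."""
    n = len(rows)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        s = sum((a - b) ** 2 for a, b in zip(rows[i], rows[j]))
        d[i][j] = d[j][i] = s
    return d

def combine(d_a, d_b):
    """Element-wise sum of two local matrices.

    Assumption: done in plaintext for illustration only; a real
    protocol would compute this sum securely.
    """
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(d_a, d_b)]

def single_linkage(dist, k):
    """Agglomerative clustering: merge the two closest clusters until k remain."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > k:
        best = None
        for p, q in combinations(range(len(clusters)), 2):
            link = min(dist[i][j] for i in clusters[p] for j in clusters[q])
            if best is None or link < best[0]:
                best = (link, p, q)
        _, p, q = best          # p < q, so deleting q keeps index p valid
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return sorted(sorted(c) for c in clusters)

# Hypothetical toy data: four records, one attribute at party A, two at party B.
party_a = [[1.0], [1.2], [8.0], [8.1]]
party_b = [[0.5, 0.4], [0.6, 0.5], [9.0, 9.1], [9.2, 9.0]]

joint = combine(local_sq_dist(party_a), local_sq_dist(party_b))
labels = single_linkage(joint, 2)
print(labels)
```

Because the squared distance is a sum over attributes, the combined matrix equals the one that would be computed on the joined records, so the clustering matches the centralized result up to floating-point rounding.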
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
De, I., Tripathy, A. (2014). A Secure Two Party Hierarchical Clustering Approach for Vertically Partitioned Data Set with Accuracy Measure. In: Thampi, S., Abraham, A., Pal, S., Rodriguez, J. (eds) Recent Advances in Intelligent Informatics. Advances in Intelligent Systems and Computing, vol 235. Springer, Cham. https://doi.org/10.1007/978-3-319-01778-5_16
DOI: https://doi.org/10.1007/978-3-319-01778-5_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-01777-8
Online ISBN: 978-3-319-01778-5
eBook Packages: Engineering (R0)