Privacy-Preserving EM Algorithm for Clustering on Social Networks
We consider the clustering problem in a private social network, in which all vertices are independent and private, and each vertex knows nothing about any vertex other than itself and its neighbors. Many clustering methods for networks have recently been proposed, and some of them handle networks that mix assortative and disassortative structure. These methods assume that the entire structure of the network is observable. However, entities in real social networks may be private and thus cannot be observed. We propose a privacy-preserving EM algorithm for clustering on distributed networks that not only handles the mixture of assortative and disassortative models but also protects the privacy of each vertex in the network. In our solution, each vertex is treated as an independent private party, and the problem becomes n-party privacy-preserving clustering, where n is the number of vertices in the network. Our algorithm does not reveal any intermediate information during its execution. The total running time depends only on the number of clusters and the maximum degree of the network, and is nearly independent of the total number of vertices.
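To make the setting concrete, the following is a minimal, non-private sketch of an EM loop for clustering vertices by their link patterns. The abstract does not specify the underlying model, so the sketch assumes a vertex mixture model in the style of Newman and Leicht, in which each cluster r has a proportion pi[r] and a distribution theta[r] over the vertices that its members link to. The function name `em_vertex_mixture`, its parameters, and the smoothing constant `eps` are all hypothetical. It is not the authors' protocol and performs no encryption. It only illustrates that the E-step for a vertex touches nothing but that vertex's own neighbor list, while the M-step needs sums over all vertices, which is presumably the part a privacy-preserving version would have to compute securely.

```python
import math
import random


def em_vertex_mixture(adj, c, iters=100, seed=0, eps=1e-9):
    """Non-private EM for an assumed vertex mixture model.

    adj maps each vertex to a list of its distinct neighbors.
    Returns (pi, theta, q): cluster proportions, per-cluster link
    distributions over vertices, and per-vertex soft memberships.
    """
    rng = random.Random(seed)
    nodes = sorted(adj)
    n = len(nodes)

    # Uniform proportions and random, normalized link distributions.
    pi = [1.0 / c] * c
    theta = []
    for _ in range(c):
        w = [rng.random() + 0.1 for _ in nodes]
        total = sum(w)
        theta.append({v: x / total for v, x in zip(nodes, w)})

    q = {}
    for _ in range(iters):
        # E-step: vertex i uses only its own neighbor list adj[i].
        for i in nodes:
            logs = [
                math.log(pi[r]) + sum(math.log(theta[r][j]) for j in adj[i])
                for r in range(c)
            ]
            top = max(logs)  # subtract the max for numerical stability
            w = [math.exp(x - top) for x in logs]
            total = sum(w)
            q[i] = [x / total for x in w]

        # M-step: these sums run over all vertices in the network.
        for r in range(c):
            pi[r] = (sum(q[i][r] for i in nodes) + eps) / (n + c * eps)
            num = {v: 0.0 for v in nodes}
            denom = 0.0
            for i in nodes:
                for j in adj[i]:
                    num[j] += q[i][r]
                denom += len(adj[i]) * q[i][r]
            # eps keeps every probability strictly positive for the logs.
            for j in nodes:
                theta[r][j] = (num[j] + eps) / (denom + n * eps)
    return pi, theta, q


if __name__ == "__main__":
    # Two triangles joined by one edge (illustrative data only).
    graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
             3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
    pi, theta, q = em_vertex_mixture(graph, c=2)
    for v in sorted(q):
        print(v, [round(x, 3) for x in q[v]])
```

Because the model groups vertices by whom they link to rather than by edge density, the same loop can in principle pick up both assortative and disassortative structure. EM can converge to a local optimum, so the memberships depend on the random initialization.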