Improve Network Clustering via Diversified Ranking

Sun, Bing-Jie; Shen, Hua-Wei; Cheng, Xue-Qi

doi:10.1007/978-3-319-21786-4_9

Bing-Jie Sun, Hua-Wei Shen & Xue-Qi Cheng

Conference paper
First Online: 01 January 2015

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9197)

Abstract

Clustering is a fundamental task in network analysis. A widely used clustering method is k-means, which iteratively refines a clustering by minimizing the distance between each data point and its cluster center. A key issue for k-means clustering is initialization, which heavily affects both its accuracy and its computational cost. This issue is particularly critical when k-means is applied to graph data, where nodes are not embedded in a metric space. In this paper, we propose using a diversified ranking method to initialize k-means clustering, i.e., to find a set of seed nodes. In diversified ranking, seed nodes are selected by considering their centrality and diversity in a unified manner. With the seed nodes as starting points, k-means clustering is then used to cluster nodes into groups. We apply the proposed method to detect communities in synthetic and real-world networks. Results indicate that the proposed method is both effective and efficient.
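
The seed-then-assign idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' algorithm. As assumptions made purely for illustration, centrality is approximated by PageRank, diversity by down-weighting the neighbors of each chosen seed (the hypothetical `penalty` parameter), and node-to-seed distance by BFS hop count. The iterative k-means refinement is omitted, so only the initialization and first assignment step are shown.

```python
from collections import deque

def pagerank(adj, damping=0.85, iters=100):
    """Power-iteration PageRank; assumes every node has at least one neighbor."""
    n = len(adj)
    rank = {v: 1.0 / n for v in adj}
    for _ in range(iters):
        new = {v: (1.0 - damping) / n for v in adj}
        for v, nbrs in adj.items():
            share = damping * rank[v] / len(nbrs)
            for u in nbrs:
                new[u] += share
        rank = new
    return rank

def pick_seeds(adj, k, penalty=0.1):
    """Greedy seeds: take the top-scoring node, then damp its neighbors' scores.

    The penalty rule is an illustrative stand-in for a diversified ranking
    method. Assumes k <= number of nodes.
    """
    score = pagerank(adj)
    seeds = []
    for _ in range(k):
        s = max(score, key=score.get)
        seeds.append(s)
        score[s] = 0.0              # never pick the same node twice
        for u in adj[s]:
            score[u] *= penalty     # discourage seeds adjacent to s
    return seeds

def bfs_dist(adj, src):
    """Hop distances from src to every reachable node."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist

def assign(adj, seeds):
    """Label each node with the index of its nearest seed (ties go to the earlier seed)."""
    dists = [bfs_dist(adj, s) for s in seeds]
    inf = float("inf")
    return {v: min(range(len(seeds)), key=lambda i: dists[i].get(v, inf))
            for v in adj}

# Toy graph: two 4-node cliques (0-3 and 4-7) joined by the edge 3-4.
adj = {v: set() for v in range(8)}
for group in (range(0, 4), range(4, 8)):
    for a in group:
        for b in group:
            if a != b:
                adj[a].add(b)
adj[3].add(4)
adj[4].add(3)

seeds = pick_seeds(adj, k=2)
labels = assign(adj, seeds)
print(seeds, labels)
```

On this toy graph the neighbor penalty pushes the second seed into the clique that does not contain the first, which is the behavior a diversity-aware initialization aims for. A pure centrality ranking would instead pick the two adjacent bridge nodes.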

Author information

Authors and Affiliations

CAS Key Laboratory of Network Data Science and Technology, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, 100190, China
Bing-Jie Sun, Hua-Wei Shen & Xue-Qi Cheng

Corresponding author

Correspondence to Hua-Wei Shen.

Editor information

Editors and Affiliations

Computer and Information Science and Engineering, University of Florida, Gainesville, Florida, USA
My T. Thai
Towson University, Towson, Maryland, USA
Nam P. Nguyen
Chinese Academy of Sciences, Beijing, China
Huawei Shen

About this paper

Cite this paper

Sun, BJ., Shen, HW., Cheng, XQ. (2015). Improve Network Clustering via Diversified Ranking. In: Thai, M., Nguyen, N., Shen, H. (eds) Computational Social Networks. CSoNet 2015. Lecture Notes in Computer Science, vol 9197. Springer, Cham. https://doi.org/10.1007/978-3-319-21786-4_9

DOI: https://doi.org/10.1007/978-3-319-21786-4_9
Published: 31 July 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-21785-7
Online ISBN: 978-3-319-21786-4
eBook Packages: Computer Science, Computer Science (R0)
