World Wide Web, Volume 20, Issue 5, pp 855–883

Ranking weighted clustering coefficient in large dynamic graphs

  • Xuefei Li
  • Lijun Chang
  • Kai Zheng
  • Zi Huang
  • Xiaofang Zhou


Efficiently searching for the top-k representative vertices is crucial for understanding the structure of large dynamic graphs. Recent studies show that communities formed by a vertex with a high local clustering coefficient and its neighbours exhibit faster information propagation as well as faster disease transmission. However, the local clustering coefficient, which measures the cliquishness of a vertex in its local neighbourhood, favours vertices with small degrees. To remedy this issue, in this paper we propose a new ranking measure, the weighted clustering coefficient (WCC) of a vertex, which integrates both the local clustering coefficient and the degree. WCC not only inherits the properties of the local clustering coefficient but also approximately measures the density (i.e., average degree) of the vertex's neighbourhood subgraph. Thus, vertices with higher WCC are more likely to be representative. We study how to efficiently compute and monitor the top-k representative vertices based on WCC over large dynamic graphs. To reduce the search space, we propose a series of heuristic upper bounds on WCC that prune a large portion of disqualified vertices. We also develop an approximation algorithm that uses the Flajolet-Martin sketch to trade an acceptable loss in accuracy for improved efficiency. We further explore an efficient incremental algorithm that handles frequent updates in dynamic graphs. Extensive experiments on a variety of real-life graph datasets demonstrate the efficiency and effectiveness of our approaches.
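
The degree bias described above, and the idea of pruning with an upper bound, can be illustrated with a minimal Python sketch. The abstract does not state the WCC formula, so the score used here (degree multiplied by the local clustering coefficient) is only an assumed stand-in, and the names build_graph, weighted_score and top_k are hypothetical. Because the local clustering coefficient is at most 1, this assumed score never exceeds the degree, which gives a simple bound for early termination; the paper's own bounds may well differ.

```python
import heapq
from itertools import combinations


def build_graph(edges):
    """Build an undirected adjacency map {vertex: set of neighbours}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def local_clustering(adj, v):
    """Fraction of pairs of v's neighbours that are themselves adjacent."""
    nbrs = adj[v]
    d = len(nbrs)
    if d < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2.0 * links / (d * (d - 1))


def weighted_score(adj, v):
    # ASSUMPTION: degree * local clustering coefficient, used as a stand-in.
    # This is not claimed to be the paper's WCC definition.
    return len(adj[v]) * local_clustering(adj, v)


def top_k(adj, k):
    """Top-k vertices by weighted_score, pruned with score(v) <= degree(v)."""
    heap = []  # min-heap of (score, vertex); heap[0] is the current k-th best
    for v in sorted(adj, key=lambda u: len(adj[u]), reverse=True):
        if len(heap) == k and len(adj[v]) <= heap[0][0]:
            break  # every remaining vertex has degree, hence score, too low
        s = weighted_score(adj, v)
        if len(heap) < k:
            heapq.heappush(heap, (s, v))
        elif s > heap[0][0]:
            heapq.heapreplace(heap, (s, v))
    return sorted(heap, reverse=True)


# Toy graph: a triangle, a 4-clique, and a star whose hub has degree 5.
edges = [("a", "b"), ("b", "c"), ("a", "c")]
edges += list(combinations(["p", "q", "r", "s"], 2))
edges += [("h", leaf) for leaf in ["x1", "x2", "x3", "x4", "x5"]]
adj = build_graph(edges)

print(local_clustering(adj, "a"), local_clustering(adj, "p"))
print(top_k(adj, 2))
```

In this toy graph the triangle vertices and the 4-clique vertices all have a local clustering coefficient of 1, so that measure alone cannot separate them, while the assumed degree-weighted score ranks the clique vertices higher. The star hub has the largest degree but a score of zero, since none of its neighbours are connected to each other.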


Keywords: Top-k search, Clustering coefficient, Large dynamic graphs, Node ranking



Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  1. School of Information Technology and Electrical Engineering, The University of Queensland, St. Lucia, Australia
  2. School of Computer Science and Engineering, The University of New South Wales, Sydney, Australia
