Ranking weighted clustering coefficient in large dynamic graphs

Abstract

Efficiently searching for the top-k representative vertices is crucial for understanding the structure of large dynamic graphs. Recent studies show that communities formed by a vertex with a high local clustering coefficient and its neighbours exhibit faster information propagation as well as faster disease transmission. However, the local clustering coefficient, which measures the cliquishness of a vertex's local neighbourhood, favours vertices with small degrees. To remedy this issue, we propose a new ranking measure, the weighted clustering coefficient (WCC) of a vertex, which integrates both the local clustering coefficient and the degree. WCC not only inherits the properties of the local clustering coefficient but also approximately measures the density (i.e., average degree) of the vertex's neighbourhood subgraph. Thus, vertices with higher WCC are more likely to be representative. We study how to efficiently compute and monitor the top-k representative vertices based on WCC over large dynamic graphs. To reduce the search space, we propose a series of heuristic upper bounds for WCC that prune a large portion of disqualifying vertices. We also develop an approximation algorithm that uses the Flajolet-Martin sketch to trade acceptable accuracy for enhanced efficiency. An efficient incremental algorithm for handling frequent updates in dynamic graphs is explored as well. Extensive experimental results on a variety of real-life graph datasets demonstrate the efficiency and effectiveness of our approaches.
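
The degree bias described in the abstract can be illustrated with a small sketch. The abstract does not give the exact formula for WCC, so the code below assumes, purely for illustration, that WCC is the product of a vertex's degree and its local clustering coefficient. The function names (`build_adjacency`, `local_clustering`, `weighted_clustering`, `top_k`) are hypothetical, and the brute-force ranking is a baseline only; it does not implement the upper-bound pruning, the sketch-based approximation, or the incremental maintenance that the paper proposes.

```python
import heapq
from collections import defaultdict
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {vertex: set(neighbours)}."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def local_clustering(adj, v):
    """Fraction of pairs of v's neighbours that are themselves adjacent."""
    neighbours = adj[v]
    degree = len(neighbours)
    if degree < 2:
        return 0.0
    links = sum(1 for a, b in combinations(neighbours, 2) if b in adj[a])
    return 2.0 * links / (degree * (degree - 1))


def weighted_clustering(adj, v):
    """ASSUMPTION (not stated in the abstract): WCC(v) = degree(v) * LCC(v)."""
    return len(adj[v]) * local_clustering(adj, v)


def top_k(adj, k, score):
    """Brute-force baseline: score every vertex and keep the k largest."""
    return heapq.nlargest(k, list(adj), key=lambda v: score(adj, v))


# A 4-clique {a, b, c, d} and a separate triangle {x, y, z}.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"),
         ("c", "d"), ("x", "y"), ("y", "z"), ("x", "z")]
graph = build_adjacency(edges)
```

In this toy graph every vertex has a local clustering coefficient of 1, so that measure alone cannot separate the clique from the triangle. Under the assumed product form, the clique vertices score higher because their neighbourhoods are denser, which matches the intuition the abstract gives for WCC.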

Notes

  1. http://snap.stanford.edu

Author information

Corresponding author

Correspondence to Xuefei Li.

About this article

Cite this article

Li, X., Chang, L., Zheng, K. et al. Ranking weighted clustering coefficient in large dynamic graphs. World Wide Web 20, 855–883 (2017). https://doi.org/10.1007/s11280-016-0420-2

  • DOI: https://doi.org/10.1007/s11280-016-0420-2

Keywords

  • Top-k search
  • Clustering coefficient
  • Large dynamic graphs
  • Node ranking