
Learning and Scaling Directed Networks via Graph Embedding

  • Mikhail Drobyshevskiy (email author)
  • Anton Korshunov
  • Denis Turdakov
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10534)

Abstract

Reliable evaluation of network mining tools requires significance and scalability testing. This is usually achieved by picking several graphs of various sizes from different domains. However, graph properties, and thus evaluation results, can differ dramatically from one domain to another. Hence the need to aggregate results over many graphs within each domain.

The paper introduces an approach to automatically learn features of a directed graph from any domain and generate similar graphs while scaling the input graph size by a real-valued factor. Generating multiple graphs of similar size allows significance testing, while scaling graph size makes scalability evaluation possible. The proposed method relies on embedding the input graph into a low-dimensional space, thus encoding graph features in a set of node vectors. Edge weights and node communities can also be imitated in optional steps.
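
As a rough illustration of the embed-then-generate idea, the following minimal sketch uses stand-in techniques that are assumptions, not the method of the paper: a truncated SVD of the adjacency matrix plays the role of the embedding, new nodes are noisy copies of resampled original node vectors, and the edge count is chosen to preserve the average degree. The function names `embed_directed` and `generate_scaled` and the parameters `dim` and `noise` are hypothetical.

```python
import numpy as np

def embed_directed(adj, dim):
    """Assumed stand-in embedding: truncated SVD, adj ~ src @ dst.T.

    Each node gets a source vector (row of src) and a target vector (row of dst).
    """
    u, s, vt = np.linalg.svd(adj, full_matrices=False)
    root = np.sqrt(s[:dim])
    return u[:, :dim] * root, vt[:dim, :].T * root

def generate_scaled(adj, factor, dim=8, noise=0.05, seed=0):
    """Generate a directed graph with about factor * n nodes that resembles adj."""
    rng = np.random.default_rng(seed)
    n = adj.shape[0]
    m = max(1, int(round(factor * n)))
    src, dst = embed_directed(adj.astype(float), min(dim, n))
    d = src.shape[1]

    # Assumption: new nodes are resampled original nodes with Gaussian jitter,
    # which is what makes repeated runs (different seeds) produce different graphs.
    idx = rng.integers(0, n, size=m)
    new_src = src[idx] + noise * rng.standard_normal((m, d))
    new_dst = dst[idx] + noise * rng.standard_normal((m, d))

    # Score every ordered pair (i, j); forbid self-loops.
    scores = new_src @ new_dst.T
    np.fill_diagonal(scores, -np.inf)

    # Assumption: keep the average degree, so edges scale linearly with nodes.
    k = min(int(round(adj.sum() * m / n)), m * (m - 1))
    out = np.zeros(m * m, dtype=int)
    if k > 0:
        top = np.argpartition(scores.ravel(), -k)[-k:]  # k highest-scoring pairs
        out[top] = 1
    return out.reshape(m, m)
```

Calling `generate_scaled(adj, 2.0, seed=s)` for several values of `s` would give multiple graphs of the same size for significance testing, while varying `factor` gives graphs of different sizes for scalability testing.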

We demonstrate that the embedding-based approach ensures variability of synthetic graphs while keeping degree and subgraph distributions close to those of the original graphs. Therefore, the method could make significance and scalability testing of network algorithms more reliable without the need to collect additional data. We also show that the embedding-based approach preserves various features in generated graphs, which other generators imitating a given graph cannot achieve.
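
One simple way to check how close the degree distributions of two directed graphs are is sketched below. It assumes a two-sample Kolmogorov-Smirnov distance between in-degree and out-degree sequences; the abstract does not say which measure the authors use, and the names `ks_distance` and `degree_distances` are hypothetical.

```python
import numpy as np

def ks_distance(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: max gap between empirical CDFs."""
    a = np.sort(np.asarray(a))
    b = np.sort(np.asarray(b))
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))

def degree_distances(adj_a, adj_b):
    """Compare in-degree (column sums) and out-degree (row sums) distributions."""
    adj_a = np.asarray(adj_a)
    adj_b = np.asarray(adj_b)
    return {
        "in": ks_distance(adj_a.sum(axis=0), adj_b.sum(axis=0)),
        "out": ks_distance(adj_a.sum(axis=1), adj_b.sum(axis=1)),
    }
```

A value of 0 means the two degree sequences have identical empirical distributions, and values near 1 mean they barely overlap. Subgraph (motif) distributions would need a separate counting step that this sketch does not cover.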

Keywords

Random graph generating, Graph embedding, Representation learning

Notes

Acknowledgements

This research was conducted in collaboration with, and supported by, Huawei Technologies Co., Ltd. under contract YB2015110136.

We are also thankful to Ilya Kozlov and Sergey Bartunov for their ideas and valuable contributions.

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Mikhail Drobyshevskiy (1), email author
  • Anton Korshunov (1)
  • Denis Turdakov (1)
  1. Institute for System Programming of the Russian Academy of Sciences, Moscow, Russia
