Predicting Graph Operator Output over Multiple Graphs

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11496)

Abstract

A growing list of domains, with Web data and applications at the forefront, is modeled by graph representations. In content-driven graph analytics, knowledge must be extracted from large numbers of available data graphs. As the number of datasets (a different type of volume) can reach immense sizes, a thorough evaluation of each input is prohibitively expensive. To date, there exists no efficient method to quantify the impact of numerous available datasets over different graph analytics tasks. To address this challenge, we propose an efficient graph operator modeling methodology. Our novel, operator-agnostic approach focuses on the inputs themselves, utilizing graph similarity to infer knowledge about them. An operator is executed on a small subset of the available inputs, and its behavior on the remaining graphs is modeled using machine learning. We propose a family of similarity measures based on the degree distribution that prove capable of producing high-quality models for many popular graph tasks, even compared to modern, state-of-the-art similarity functions. Our evaluation over both real-world and synthetic graph datasets indicates that our method achieves extremely accurate modeling of many commonly encountered operators, while achieving massive speedups over a brute-force alternative.
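
The workflow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which degree-distribution measure or which learner is used, so the Bhattacharyya coefficient and the similarity-weighted nearest-neighbour estimate below are assumptions chosen for illustration, and the function names `degree_distribution`, `bhattacharyya_similarity` and `predict_output` are hypothetical. Graphs are assumed to be undirected edge lists without isolated nodes.

```python
import math
from collections import Counter


def degree_distribution(edges):
    """Return {degree: fraction of nodes} for an undirected edge list."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    histogram = Counter(degree.values())
    n = len(degree)
    return {d: count / n for d, count in histogram.items()}


def bhattacharyya_similarity(p, q):
    """Bhattacharyya coefficient of two discrete distributions (assumed measure)."""
    return sum(math.sqrt(p[d] * q[d]) for d in p.keys() & q.keys())


def predict_output(target, sampled, outputs, k=2):
    """Estimate the operator output for `target` from the k most similar
    graphs on which the operator was actually executed (assumed learner)."""
    scored = sorted(
        ((bhattacharyya_similarity(target, s), y) for s, y in zip(sampled, outputs)),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    total = sum(weight for weight, _ in scored)
    if total == 0:
        # No overlap with any sampled graph: fall back to a plain average.
        return sum(y for _, y in scored) / len(scored)
    return sum(weight * y for weight, y in scored) / total


# Toy graphs; the "measured" operator outputs 10.0 and 20.0 are made-up values.
triangle = degree_distribution([(0, 1), (1, 2), (2, 0)])
star = degree_distribution([(0, 1), (0, 2), (0, 3)])
path = degree_distribution([(0, 1), (1, 2)])
estimate = predict_output(path, [triangle, star], [10.0, 20.0])
```

In this sketch the operator is run only on the two sampled graphs, and the estimate for the third graph is obtained purely from degree-distribution similarity, which is the source of the speedup over evaluating every input.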

Notes

  1. https://github.com/giagiannis/data-profiler

Author information

Corresponding author

Correspondence to Tasos Bakogiannis.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Bakogiannis, T., Giannakopoulos, I., Tsoumakos, D., Koziris, N. (2019). Predicting Graph Operator Output over Multiple Graphs. In: Bakaev, M., Frasincar, F., Ko, IY. (eds) Web Engineering. ICWE 2019. Lecture Notes in Computer Science, vol 11496. Springer, Cham. https://doi.org/10.1007/978-3-030-19274-7_9

  • DOI: https://doi.org/10.1007/978-3-030-19274-7_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-19273-0

  • Online ISBN: 978-3-030-19274-7

  • eBook Packages: Computer Science, Computer Science (R0)
