Towards the Next Generation of Large-Scale Network Archives

  • Stijn Heldens
  • Ana Varbanescu
  • Wing Lung Ngai
  • Tim Hegeman
  • Alexandru Iosup
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10104)

Abstract

Both data scientists and computer scientists need graph (network) datasets for the design, comparison, and tuning of important scientific results and practical artifacts. Despite the abundance of data in practice, freely available datasets are usually difficult to access, limited in size and diversity, and collected in small static archives.

This work presents our vision of a next generation of graph data archives. To guide the design of such archives, we formulate six key requirements. We further propose GraphPedia, a prototype architecture that addresses these requirements and provides a large collection of diverse graphs in many storage formats, together with rich metadata, advanced search, and on-demand graph generation. Once the open implementation challenges are resolved, GraphPedia will become a dynamic meeting space for exchanging graphs.

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Stijn Heldens (1)
  • Ana Varbanescu (2)
  • Wing Lung Ngai (1)
  • Tim Hegeman (1)
  • Alexandru Iosup (1)

  1. Delft University of Technology, Delft, The Netherlands
  2. University of Amsterdam, Amsterdam, The Netherlands
