A Discussion on the Design of Graph Database Benchmarks

Dominguez-Sal, David; Martinez-Bazan, Norbert; Muntes-Mulero, Victor; Baleta, Pere; Larriba-Pey, Josep Lluis

doi:10.1007/978-3-642-18206-8_3

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 6417)

Abstract

Graph database management systems (GDBs) are gaining popularity. They are used to analyze the huge graph datasets that arise naturally in many application areas where interrelated data must be modeled. The objective of this paper is to raise a new topic of discussion in the benchmarking community and to give practitioners a set of basic guidelines for GDB benchmarking. We strongly believe that GDBs will become an important player in the data analysis market, and that their performance and capabilities will therefore become important as well. For this reason, we discuss the aspects that we consider most relevant: the characteristics of the graphs to be included in the benchmark, the characteristics of the queries that matter in graph analysis applications, and the evaluation workbench.

The members of DAMA-UPC thank the Ministry of Science and Innovation of Spain and the Generalitat de Catalunya for grant numbers TIN2009-14560-C03-03 and GRC-1087, respectively.

Author information

Authors and Affiliations

DAMA-UPC, Barcelona, Spain
David Dominguez-Sal, Norbert Martinez-Bazan, Victor Muntes-Mulero & Josep Lluis Larriba-Pey
Sparsity Technologies, Barcelona, Spain
Pere Baleta

Editor information

Editors and Affiliations

Access and Virtualization Business Unit, Cisco Systems, Inc., 3800 Zanker Road, 95134, San Jose, CA, USA
Raghunath Nambiar
Server Technologies, Oracle Corporation, 500 Oracle Parkway, 94065, Redwood Shores, CA, USA
Meikel Poess

Cite this paper

Dominguez-Sal, D., Martinez-Bazan, N., Muntes-Mulero, V., Baleta, P., Larriba-Pey, J.L. (2011). A Discussion on the Design of Graph Database Benchmarks. In: Nambiar, R., Poess, M. (eds) Performance Evaluation, Measurement and Characterization of Complex Systems. TPCTC 2010. Lecture Notes in Computer Science, vol 6417. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-18206-8_3

DOI: https://doi.org/10.1007/978-3-642-18206-8_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-18205-1
Online ISBN: 978-3-642-18206-8
eBook Packages: Computer Science, Computer Science (R0)
