Survey of Graph Database Performance on the HPC Scalable Graph Analysis Benchmark

  • Conference paper
Web-Age Information Management (WAIM 2010)

Abstract

The analysis of relationships among data entities has led to modeling them as graphs. Since dataset sizes have grown significantly in recent years, it has become necessary to implement efficient graph databases that can load and manage these huge datasets.

In this paper, we evaluate the performance of four of the most scalable native graph database projects (Neo4j, Jena, HypergraphDB and DEX). We implement the full HPC Scalable Graph Analysis Benchmark, and we test the performance of each database for different typical graph operations and graph sizes, showing that in their current development status, DEX and Neo4j are the most efficient graph databases.

The members of DAMA-UPC thank the Ministry of Science and Innovation of Spain and Generalitat de Catalunya, for grant numbers TIN2009-14560-C03-03 and GRC-1087 respectively.

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Dominguez-Sal, D., Urbón-Bayes, P., Giménez-Vañó, A., Gómez-Villamor, S., Martínez-Bazán, N., Larriba-Pey, J.L. (2010). Survey of Graph Database Performance on the HPC Scalable Graph Analysis Benchmark. In: Shen, H.T., et al. Web-Age Information Management. WAIM 2010. Lecture Notes in Computer Science, vol 6185. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16720-1_4

  • DOI: https://doi.org/10.1007/978-3-642-16720-1_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-16719-5

  • Online ISBN: 978-3-642-16720-1

  • eBook Packages: Computer Science, Computer Science (R0)
