Highspeed Graph Processing Exploiting Main-Memory Column Stores

  • Matthias Hauck
  • Marcus Paradies
  • Holger Fröning
  • Wolfgang Lehner
  • Hannes Rauhe
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9523)

Abstract

A popular belief in the graph database community is that relational database management systems are generally ill-suited for efficient graph processing. This might apply to analytic graph queries that perform iterative computations on the graph, but it does not necessarily hold true for short-running, OLTP-style graph queries. In this paper we argue that, instead of extending a graph database management system with traditional relational operators (predicate evaluation, sorting, grouping, and aggregations, among others), one should consider adding a graph abstraction and graph-specific operations, such as graph traversals and pattern matching, to relational database management systems. To support our claims, we take an example query from the interactive query workload of the LDBC social network benchmark and run it against our enhanced in-memory, columnar relational database system. Our performance measurements indicate that a columnar RDBMS, extended by graph-specific operators and data structures, can serve as a foundation for high-speed graph processing on big memory machines with non-uniform memory access and a large number of available cores.

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Matthias Hauck (1)
  • Marcus Paradies (2)
  • Holger Fröning (1)
  • Wolfgang Lehner (2)
  • Hannes Rauhe (3)

  1. Computer Engineering Group, Ruprecht-Karls University of Heidelberg, Heidelberg, Germany
  2. Database Systems Group, TU Dresden, Dresden, Germany
  3. SAP SE, Weinheim, Germany
