An Introduction to Graph Data Management

Chapter
Part of the Data-Centric Systems and Applications book series (DCSA)

Abstract

Graph data management concerns the research and development of powerful technologies for storing, processing and analyzing large volumes of graph data. This chapter presents an overview of the foundations and systems of graph data management. Specifically, we present a historical overview of the area, study graph database models, characterize essential graph-oriented queries, review graph query languages, and explore the features of current graph data management systems (i.e., graph databases and graph-processing frameworks).

Notes

Acknowledgements

R. Angles and C. Gutierrez were supported by the Millennium Nucleus Center for Semantic Web Research under grant NC120004.

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Computer Science, Universidad de Talca, Curicó, Chile, and Millennium Institute for Foundational Research on Data, Santiago, Chile
  2. Department of Computer Science, Universidad de Chile, Santiago, Chile
  3. Millennium Institute for Foundational Research on Data, Santiago, Chile
