Abstract
Graph database systems are increasingly adopted for storing and processing heterogeneous network-like datasets. However, due to the novelty of such systems, no standard data model or query language has yet emerged. Consequently, migrating datasets or applications even between related technologies often requires a large amount of manual work or ad-hoc solutions, exposing users to the risk of vendor lock-in. To avoid this threat, vendors are working on supporting existing standard languages (e.g. SQL) or on standardising languages.
In this paper, we present a formal specification for openCypher, a high-level declarative graph query language with an ongoing standardisation effort. We introduce relational graph algebra, which extends relational operators by adapting graph-specific operators, and define a mapping from core openCypher constructs to this algebra. We propose an algorithm that allows systematic compilation of openCypher queries.
Notes
- 1.
- 2.
Requiring uniqueness of edges is called edge isomorphic matching. Other query languages and execution engines might use vertex isomorphic matching (requiring uniqueness of vertices), isomorphic matching (requiring uniqueness of both vertices and edges) or homomorphic matching (not requiring uniqueness of either) [8].
- 3.
The term result set refers to the result collection, which can be a set, a bag or a list.
- 4.
The functions return a single scalar value, while returns a list.
- 5.
The decision on grouping semantics is due after the camera-ready submission deadline. The semantics presented in this paper is one of the possible approaches.
- 6.
In openCypher, the filtering operation is a subclause of and . When used in as illustrated on line 3 of List. 3.13, is similar to the construct of SQL with the major difference that, in openCypher it is also allowed when no aggregation was specified in the query.
- 7.
Patterns in the openCypher query might contain anonymous vertices and edges. In the algebraic form, we denote this with names starting with an underscore, such as \(\mathtt {\_v1}\) and \(\mathtt {\_e2}\).
- 8.
Label and type constraints can be omitted for the get-vertices operator and the expand operators. For example, returns all vertices, while traverses all outgoing edges e from vertices v to w, regardless of their labels/types.
- 9.
Matching semantics might use value equality of attributes that share a common name (similarly to natural join) or use an arbitrary condition (similarly to \(\theta \)-join).
- 10.
SQL implementations offer the OFFSET and the LIMIT/TOP keywords.
- 11.
Our prototype, ingraph, is available at: http://docs.inf.mit.bme.hu/ingraph/.
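The matching semantics named in note 2 can be illustrated with a minimal Python sketch. It is not taken from the paper or from the ingraph prototype: the toy graph, the function name `match_two_hop` and the semantics labels are hypothetical, chosen only to show how the uniqueness requirements change the result of a two-edge path pattern.

```python
from itertools import product

# Hypothetical toy graph: directed edges as (edge_id, source, target).
EDGES = [
    ("e1", "a", "b"),
    ("e2", "b", "a"),
    ("e3", "b", "c"),
    ("e4", "c", "c"),  # self-loop, so one edge can be traversed twice
]


def match_two_hop(edges, semantics):
    """Enumerate matches of the pattern (x)-[p]->(y)-[q]->(z).

    semantics (illustrative labels, not openCypher keywords):
      "homomorphic" - no uniqueness requirement
      "edge"        - edges must be pairwise distinct
      "vertex"      - vertices must be pairwise distinct
      "isomorphic"  - both vertices and edges must be distinct
    """
    results = []
    for (p, x, y), (q, y2, z) in product(edges, repeat=2):
        if y != y2:
            continue  # the two edges must meet at the middle vertex
        edges_distinct = p != q
        vertices_distinct = len({x, y, z}) == 3
        if semantics in ("edge", "isomorphic") and not edges_distinct:
            continue
        if semantics in ("vertex", "isomorphic") and not vertices_distinct:
            continue
        results.append((x, p, y, q, z))
    return results


for semantics in ("homomorphic", "edge", "vertex", "isomorphic"):
    print(semantics, len(match_two_hop(EDGES, semantics)))
```

On this graph, homomorphic matching also accepts the walk that uses the self-loop `e4` twice, edge isomorphic matching rejects only that walk, and the vertex-based variants keep only the path through three distinct vertices.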
References
Arendt, T., Biermann, E., Jurack, S., Krause, C., Taentzer, G.: Henshin: advanced concepts and tools for in-place EMF model transformations. In: Petriu, D.C., Rouquette, N., Haugen, Ø. (eds.) MODELS 2010. LNCS, vol. 6394, pp. 121–135. Springer, Heidelberg (2010). doi:10.1007/978-3-642-16145-2_9
Bergmann, G., Horváth, Á., Ráth, I., Varró, D., Balogh, A., Balogh, Z., Ökrös, A.: Incremental evaluation of model queries over EMF Models. In: Petriu, D.C., Rouquette, N., Haugen, Ø. (eds.) MODELS 2010. LNCS, vol. 6394, pp. 76–90. Springer, Heidelberg (2010). doi:10.1007/978-3-642-16145-2_6
Botoeva, E., et al.: OBDA beyond relational DBs: a study for MongoDB. In: Proceedings of the 29th International Workshop on Description Logics (2016)
Elmasri, R., Navathe, S.B.: Fundamentals of Database Systems, 3rd edn. Addison-Wesley-Longman, Boston (2000)
Erling, O., et al.: The LDBC social network benchmark: interactive workload. In: SIGMOD, pp. 619–630 (2015)
Garcia-Molina, H., Ullman, J.D., Widom, J.: Database Systems - The Complete Book, 2nd edn. Pearson Education, Harlow (2009)
Hölsch, J., Grossniklaus, M.: An algebra and equivalences to transform graph patterns in Neo4j. In: GraphQ at EDBT/ICDT (2016)
Junghanns, M., et al.: Cypher-based graph pattern matching in GRADOOP. In: GRADES at SIGMOD (2017)
Kolovos, D.S., Paige, R.F., Polack, F.A.C.: The epsilon transformation language. In: Vallecillo, A., Gray, J., Pierantonio, A. (eds.) ICMT 2008. LNCS, vol. 5063, pp. 46–60. Springer, Heidelberg (2008). doi:10.1007/978-3-540-69927-9_4
Krause, C., Johannsen, D., Deeb, R., Sattler, K.-U., Knacker, D., Niadzelka, A.: An SQL-based query language and engine for graph pattern matching. In: Echahed, R., Minas, M. (eds.) ICGT 2016. LNCS, vol. 9761, pp. 153–169. Springer, Cham (2016). doi:10.1007/978-3-319-40530-8_10
Li, C., Chang, K.C., Ilyas, I.F., Song, S.: RankSQL: query algebra and optimization for relational top-k queries. In: SIGMOD, pp. 131–142 (2005)
Libkin, L., et al.: Querying graphs with data. J. ACM 63(2), 14:1–14:53 (2016)
Neo Technology. openCypher project (2017). http://www.opencypher.org/
Pérez, J., et al.: Semantics and complexity of SPARQL. ACM TODS 34(3), 1–45 (2009)
Rodriguez, M.A.: A collectively generated model of the world. In: Collective Intelligence: Creating a Prosperous World at Peace, pp. 261–264 (2008)
Rodriguez, M.A.: The gremlin graph traversal machine and language (invited talk). In: DBPL, pp. 1–10 (2015)
Rodriguez, M.A., Neubauer, P.: The graph traversal pattern. In: Graph Data Management: Techniques and Applications, pp. 29–46 (2011)
Rudolf, M., et al.: The graph story of the SAP HANA database. In: BTW (2013)
Sakr, S., Elnikety, S., He, Y.: G-SPARQL: a hybrid engine for querying large attributed graphs. In: CIKM, pp. 335–344 (2012)
Szárnyas, G., Izsó, B., Ráth, I., Harmath, D., Bergmann, G., Varró, D.: IncQuery-D: a distributed incremental model query framework in the cloud. In: Dingel, J., Schulte, W., Ramos, I., Abrahão, S., Insfran, E. (eds.) MODELS 2014. LNCS, vol. 8767, pp. 653–669. Springer, Cham (2014). doi:10.1007/978-3-319-11653-2_40
Szárnyas, G., et al.: The train benchmark: cross-technology performance evaluation of continuous model validation. Softw. Syst. Model. 1–29 (2017)
Szárnyas, G., Maginecz, J., Varró, D.: Evaluation of optimization strategies for incremental graph queries. Periodica Polytechnica, EECS (2017)
Szárnyas, G., Marton, J.: Formalisation of openCypher queries in relational algebra. Technical report, Budapest University of Technology and Economics (2017). http://hdl.handle.net/10890/5395
W3C. Resource Description Framework (2014). https://www.w3.org/RDF/
Acknowledgements
Gábor Szárnyas and Dániel Varró were supported by the MTA-BME Lendület Research Group on Cyber-Physical Systems and the NSERC RGPIN-04573-16 project. The authors would like to thank Gábor Bergmann and János Maginecz for their comments on the draft of this paper.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Marton, J., Szárnyas, G., Varró, D. (2017). Formalising openCypher Graph Queries in Relational Algebra. In: Kirikova, M., Nørvåg, K., Papadopoulos, G. (eds) Advances in Databases and Information Systems. ADBIS 2017. Lecture Notes in Computer Science, vol 10509. Springer, Cham. https://doi.org/10.1007/978-3-319-66917-5_13
DOI: https://doi.org/10.1007/978-3-319-66917-5_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-66916-8
Online ISBN: 978-3-319-66917-5
eBook Packages: Computer Science, Computer Science (R0)