Model-Driven Engineering of an OpenCypher Engine: Using Graph Queries to Compile Graph Queries

  • József Marton
  • Gábor Szárnyas
  • Márton Búr
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10567)

Abstract


Graph database systems are increasingly adopted for storing and processing heterogeneous network-like datasets. Many challenging applications with near real-time requirements, such as financial fraud detection, on-the-fly model validation and root cause analysis, can be formalised as graph problems and tackled efficiently with graph databases. However, as no standard graph query language has yet emerged, users risk vendor lock-in.

The openCypher group aims to define an open specification for a declarative graph query language. However, creating an openCypher-compatible query engine requires significant research and engineering efforts. Meanwhile, model-driven language workbenches support the creation of domain-specific languages by providing high-level tools to create parsers, editors and compilers. In this paper, we present an approach to build a compiler and optimizer for openCypher using model-driven technologies, which allows developers to define declarative optimization rules.

Acknowledgements



The second and third authors of this work were partially supported by the MTA-BME Lendület Research Group on Cyber-Physical Systems. We would like to thank János Maginecz and Dávid Szakállas for their contributions to the relational graph algebra model. We are also grateful to András Vörös and Gábor Bergmann for their suggestions and comments on the draft of this paper.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Database Laboratory, Budapest University of Technology and Economics, Budapest, Hungary
  2. Fault Tolerant Systems Research Group, Budapest University of Technology and Economics, Budapest, Hungary
  3. MTA-BME Lendület Research Group on Cyber-Physical Systems, Budapest, Hungary
