Query Language and Access Methods for Graph Databases

Part of the Advances in Database Systems book series (ADBS, volume 40)


With the prevalence of graph data in a variety of domains, there is an increasing need for a language to query and manipulate graphs with heterogeneous attributes and structures. We present a graph query language (GraphQL) that supports bulk operations on graphs with arbitrary structures and annotated attributes. In this language, graphs are the basic unit of information, and each query manipulates one or more collections of graphs at a time. The core of GraphQL is a graph algebra extended from the relational algebra, in which the selection operator is generalized to graph pattern matching and a composition operator is introduced for rewriting matched graphs. We then investigate access methods for the selection operator. Pattern matching over large graphs is challenging because subgraph isomorphism is NP-complete. We address this with a combination of techniques: use of neighborhood subgraphs and profiles, joint reduction of the search space, and optimization of the search order. Experimental results on large real and synthetic graphs demonstrate that graph-specific optimizations outperform an SQL-based implementation by orders of magnitude.
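To make the access-method ideas in the abstract concrete, the following is a minimal illustrative sketch and not the chapter's implementation. It assumes undirected labeled graphs stored as adjacency dictionaries, takes a node's "profile" to be the multiset of labels in its radius-1 neighborhood, and orders the search by candidate-set size. The function names (`profile`, `candidates`, `match`) and the toy data are hypothetical, and joint reduction of the search space is not shown.

```python
from collections import Counter

def profile(graph, labels, node):
    """Multiset of labels of `node` and its direct neighbors (assumed profile definition)."""
    return Counter([labels[node]] + [labels[n] for n in graph[node]])

def candidates(pattern, p_labels, graph, g_labels):
    """Keep, per pattern node, the data nodes whose profile contains the pattern node's profile."""
    result = {}
    for u in pattern:
        pu = profile(pattern, p_labels, u)
        result[u] = [
            v for v in graph
            if g_labels[v] == p_labels[u]
            # Counter subtraction drops non-positive counts, so an empty
            # result means every label needed by u is available around v.
            and not (pu - profile(graph, g_labels, v))
        ]
    return result

def match(pattern, p_labels, graph, g_labels):
    """Return all injective, label- and edge-preserving mappings from pattern to graph."""
    cand = candidates(pattern, p_labels, graph, g_labels)
    # Simple search-order heuristic: most selective pattern nodes first.
    order = sorted(pattern, key=lambda u: len(cand[u]))
    results = []

    def extend(i, mapping):
        if i == len(order):
            results.append(dict(mapping))
            return
        u = order[i]
        for v in cand[u]:
            if v in mapping.values():
                continue  # keep the mapping injective
            # Every already-mapped neighbor of u must be adjacent to v.
            if all(mapping[w] in graph[v] for w in pattern[u] if w in mapping):
                mapping[u] = v
                extend(i + 1, mapping)
                del mapping[u]

    extend(0, {})
    return results

# Toy data (hypothetical): a path pattern A-B-C and a small data graph.
pattern = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
p_labels = {"a": "A", "b": "B", "c": "C"}
graph = {1: {2}, 2: {1, 3}, 3: {2}, 4: {5}, 5: {4}}
g_labels = {1: "A", 2: "B", 3: "C", 4: "A", 5: "B"}

pruned = candidates(pattern, p_labels, graph, g_labels)
matches = match(pattern, p_labels, graph, g_labels)
```

In this toy input, the profile test alone discards node 5 as a candidate for pattern node `b`, because node 5 has no neighbor labeled C; the backtracking step then only has to reject node 4 for `a`.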


Keywords

Graph query language, Graph algebra, Graph pattern matching





Copyright information

© Springer-Verlag US 2010

Authors and Affiliations

  1. Google Inc., Mountain View, USA
  2. Department of Computer Science, University of California, Santa Barbara, Santa Barbara, USA
