Query Language and Access Methods for Graph Databases

  • Huahai He
  • Ambuj K. Singh
Chapter
Part of the Advances in Database Systems book series (ADBS, volume 40)

Abstract

With the prevalence of graph data in a variety of domains, there is an increasing need for a language to query and manipulate graphs with heterogeneous attributes and structures. We present a graph query language (GraphQL) that supports bulk operations on graphs with arbitrary structures and annotated attributes. In this language, graphs are the basic unit of information, and each query manipulates one or more collections of graphs at a time. The core of GraphQL is a graph algebra extended from the relational algebra, in which the selection operator is generalized to graph pattern matching and a composition operator is introduced for rewriting matched graphs. We then investigate access methods for the selection operator. Pattern matching over large graphs is challenging due to the NP-completeness of subgraph isomorphism. We address this with a combination of techniques: use of neighborhood subgraphs and profiles, joint reduction of the search space, and optimization of the search order. Experimental results on large real and synthetic graphs demonstrate that graph-specific optimizations outperform an SQL-based implementation by orders of magnitude.
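
The abstract names profile-based pruning and search-order optimization without giving details, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes undirected graphs stored as adjacency dictionaries, radius-1 neighborhoods, a "profile" defined as the multiset of labels in a node's neighborhood, and a fewest-candidates-first search order. All function names (`profile`, `candidates`, `match`) are hypothetical, and the joint reduction of the search space mentioned above is not implemented here.

```python
from collections import Counter


def profile(graph, labels, node):
    """Assumed profile: multiset of labels of `node` and its direct neighbors."""
    return Counter([labels[node]] + [labels[n] for n in graph[node]])


def candidates(p_graph, p_labels, g_graph, g_labels):
    """Keep, for each pattern node, only data nodes whose profile contains its profile."""
    g_profiles = {v: profile(g_graph, g_labels, v) for v in g_graph}
    cands = {}
    for u in p_graph:
        pu = profile(p_graph, p_labels, u)
        # `pu - pv` is empty exactly when every label count in pu is covered by pv.
        cands[u] = [v for v, pv in g_profiles.items()
                    if g_labels[v] == p_labels[u] and not (pu - pv)]
    return cands


def match(p_graph, p_labels, g_graph, g_labels):
    """Backtracking search for all label-preserving subgraph matches of the pattern."""
    cands = candidates(p_graph, p_labels, g_graph, g_labels)
    # Assumed search-order heuristic: try pattern nodes with the fewest candidates first.
    order = sorted(p_graph, key=lambda u: len(cands[u]))
    results = []

    def extend(i, mapping):
        if i == len(order):
            results.append(dict(mapping))
            return
        u = order[i]
        for v in cands[u]:
            if v in mapping.values():
                continue  # each data node may be used at most once
            # Every pattern edge to an already-mapped node must exist in the data graph.
            if all(mapping[w] in g_graph[v] for w in p_graph[u] if w in mapping):
                mapping[u] = v
                extend(i + 1, mapping)
                del mapping[u]

    extend(0, {})
    return results


# Hypothetical example: a labeled triangle pattern and a small data graph.
pattern = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
pattern_labels = {"a": "A", "b": "B", "c": "C"}
data = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5}, 5: {4}}
data_labels = {1: "A", 2: "B", 3: "C", 4: "A", 5: "B"}

pruned = candidates(pattern, pattern_labels, data, data_labels)
matches = match(pattern, pattern_labels, data, data_labels)
```

In this example, node 5 carries label B but has no C-labeled neighbor, so the profile test discards it before any search begins. Node 4 survives the profile test yet fails the edge check during backtracking, which illustrates that profiles prune candidates but do not replace verification.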

Keywords

Graph query language, Graph algebra, Graph pattern matching

Copyright information

© Springer-Verlag US 2010

Authors and Affiliations

  1. Google Inc., Mountain View, USA
  2. Department of Computer Science, University of California, Santa Barbara, Santa Barbara, USA