Query Language and Access Methods for Graph Databases
With the prevalence of graph data in a variety of domains, there is an increasing need for a language to query and manipulate graphs with heterogeneous attributes and structures. We present a graph query language (GraphQL) that supports bulk operations on graphs with arbitrary structures and annotated at- tributes. In this language, graphs are the basic unit of information and each query manipulates one or more collections of graphs at a time. The core of GraphQL is a graph algebra extended from the relational algebra in which the selection operator is generalized to graph pattern matching and a composition operator is introduced for rewriting matched graphs. Then, we investigate access methods of the selection operator. Pattern matching over large graphs is challenging due to the NP-completeness of subgraph isomorphism. We address this by a combination of techniques: use of neighborhood subgraphs and pro- files, joint reduction of the search space, and optimization of the search order. Experimental results on real and synthetic large graphs demonstrate that graph specific optimizations outperform an SQL-based implementation by orders of magnitude.
KeywordsGraph query language Graph algebra Graph pattern matching
Unable to display preview. Download preview PDF.
- S. Al-Khalifa, H. V. Jagadish, J. M. Patel, Y. Wu, N. Koudas, and D. Srivastava. Structural joins: A primitive for efficient xml query pattern matching. In ICDE, pages 141–, 2002.Google Scholar
- S. Asthana et al. Predicting protein complex membership using probabilistic network reliability. Genome Research, May 2004.Google Scholar
- S. Berretti, A. D. Bimbo, and E. Vicario. Efficient matching and indexing of graph models in content-based retrieval. In IEEE Trans. on Pattern Analysis and Machine Intelligence, volume 23, 2001.Google Scholar
- S. Boag, D. Chamberlin, M. F. Fernandez, D. Florescu, J. Robie, and J. Simeon. XQuery 1.0: An XML query language. W3C, http://www.w3.org/TR/xquery/,2007.
- C. Branden and J. Tooze. Introduction to protein structure. Garland, 2 edition, 1998.Google Scholar
- N. Bruno, N. Koudas, and D. Srivastava. Holistic twig joins: optimal XML pattern matching. In SIGMOD Conference, pages 310–321, 2002.Google Scholar
- S. Chaudhuri. An overview of query optimization in relational systems. In PODS, pages 34–43, 1998.Google Scholar
- L. Chen, A. Gupta, and M. E. Kurul. Stack-based algorithms for pattern matching on dags. In Proc. of VLDB ’05, pages 493–504, 2005.Google Scholar
- J. Cheng, Y. Ke, W. Ng, and A. Lu. FG-Index: towards verification-free query processing on graph databases. In Proc. of SIGMOD ’07, 2007.Google Scholar
- [10.] J. Cheng, J. X. Yu, X. Lin, H. Wang, and P. S. Yu. Fast computation of reachability labeling for large graphs. In EDBT, pages 961–979, 2006.Google Scholar
- M. P. Consens and A. O. Mendelzon. GraphLog: a visual formalism for real life recursion. In PODS, 1990.Google Scholar
- Gene Ontology. http://www.geneontology.org/.
- R. H. Guting. GraphDB: Modeling and querying graphs in databases. In Proc. of VLDB’94, pages 297–308, 1994.Google Scholar
- M. Gyssens, J. Paredaens, and D. van Gucht. A graph-oriented object database model. In Proc. of PODS ’90, pages 417–424, 1990.Google Scholar
- H. He and A. K. Singh. Closure-Tree: An Index Structure for Graph Queries. In Proc. of ICDE ’06, Atlanta, USA, 2006.Google Scholar
- H. He and A. K. Singh. Graphs-at-a-time: Query Language and Access Methods for Graph Databases. In Proc. of SIGMOD ’08, pages 405–418, Vancouver, Canada, 2008.Google Scholar
- J. Hopcroft and R. Karp. An n 5/2 algorithm for maximum matchings in bipartite graphs. SIAM J. Computing, 1973.Google Scholar
- J. E. Hopcroft and J. D. Ullman. Introduction to Automata Theory, Languages, and Computation. Addison Wesley, 1979.Google Scholar
- H. V. Jagadish, L. V. S. Lakshmanan, D. Srivastava, and K. Thompson. TAX: A tree algebra for XML. In Proc. of DBPL ’01, 2001.Google Scholar
- H. Jiang, H. Wang, P. S. Yu, and S. Zhou. GString: A novel approach for efficient search in graph databases. In ICDE, 2007.Google Scholar
- J. Lee, J. Oh, and S. Hwang. STRG-Index: Spatio-temporal region graph indexing for large video databases. In Proc. of SIGMOD, 2005.Google Scholar
- F. Manola and E. Miller. RDF Primer. W3C, http://www.w3.org/TR/rdf-primer/,2004.
- E. Prud’hommeaux and A. Seaborne. SPARQL query language for RDF. W3C, http://www.w3.org/TR/rdf-sparql-query/,2007.
- R. Ramakrishnan and J. Gehrke. Database Management Systems, chapter 24 Deductive Databases. McGraw-Hill, third edition, 2003.
- J. Rekers and A. Schurr. A graph grammar approach to graphical parsing. In 11th International IEEE Symposium on Visual Languages, 1995.Google Scholar
- G. Rozenberg (Ed.). Handbook on Graph Grammars and Computing by Graph Transformation: Foundations, volume 1. World Scientific, 1997.Google Scholar
- R. Schenkel, A. Theobald, and G. Weikum. Efficient creation and incremental maintenance of the HOPI index for complex XML document collections. In Proc. of ICDE ’05, pages 360–371, 2005.Google Scholar
- J. Shanmugasundaram, K. Tufte, C. Zhang, G. He, D. J. DeWitt, and J. F. Naughton. Relational databases for querying XML documents: Limitations and opportunities. In VLDB, pages 302–314, 1999.Google Scholar
- D. Shasha, J. T. L. Wang, and R. Giugno. Algorithmics and applications of tree and graph searching. In Proc. of PODS, 2002.Google Scholar
- L. Sheng, Z. M. Ozsoyoglu, and G. Ozsoyoglu. A graph query language and its query processing. In ICDE, 1999.Google Scholar
- Y. Tian, R. C. McEachin, C. Santos, D. J. States, and J. M. Patel. SAGA: a subgraph matching tool for biological graphs. Bioinformatics, 23(2), 2007.Google Scholar
- S. Trißl and U. Leser. Fast and practical indexing and querying of very large graphs. In Proc. of SIGMOD ’07, pages 845–856, 2007.Google Scholar
- H. Wang, H. He, J. Yang, P. S. Yu, and J. X. Yu. Dual labeling: Answering graph reachability queries in constant time. In Proc. of ICDE ’06, page 75, 2006.Google Scholar
- D. W. Williams, J. Huan, and W. Wang. Graph database indexing using structured graph decomposition. In ICDE, 2007.Google Scholar
- X. Yan, P. S. Yu, and J. Han. Graph Indexing: A frequent structure-based approach. In Proc. of SIGMOD, 2004.Google Scholar
- S. Zhang, M. Hu, and J. Yang. TreePi: A novel graph indexing method. In ICDE, 2007.Google Scholar
- P. Zhao, J. X. Yu, and P. S. Yu. Graph indexing: Tree + delta >= graph. In Proc. of VLDB, pages 938–949, 2007.Google Scholar