The VLDB Journal, Volume 22, Issue 2, pp 229–252

Lindex: a lattice-based index for graph databases

Regular Paper

Abstract

Subgraph querying has wide applications in fields such as cheminformatics and bioinformatics. Given a query graph q, a subgraph-querying algorithm retrieves from a graph database D all graphs D(q) that have q as a subgraph. Subgraph querying is costly because it uses subgraph isomorphism tests, which are NP-complete. Graph indices are commonly used to improve the performance of subgraph querying in graph databases. Subgraph-querying algorithms first construct a candidate answer set by filtering out a set of false answers and then verify each candidate graph using subgraph isomorphism tests. To build graph indices, various kinds of substructure (subgraph, subtree, or path) features have been proposed with the goal of maximizing the filtering rate. Each of them works with a specifically designed index structure; for example, discriminative and frequent subgraph features work with gIndex, and δ-TCFG features work with FG-index. We propose Lindex, a graph index that indexes subgraphs contained in database graphs. Nodes in Lindex represent key-value pairs where the key is a subgraph in a database and the value is a list of database graphs containing the key. We propose two heuristics, used in the construction of Lindex, that allow us to determine answers to subgraph queries while conducting fewer subgraph isomorphism tests. Consequently, Lindex improves subgraph-querying efficiency. In addition, Lindex is compatible with any choice of features. Empirically, we demonstrate that Lindex, used in conjunction with subgraph indexing features proposed in previous works, outperforms other specifically designed index structures. As a novel index structure, Lindex (1) is effective in filtering false graphs, (2) provides fast index lookups, (3) is fast with respect to index construction and maintenance, and (4) can be constructed using any set of substructure index features. These four properties result in a fast and scalable subgraph-querying infrastructure.
We substantiate the benefits of Lindex and its disk-resident variation Lindex+ theoretically and empirically.
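
The filter-then-verify scheme described in the abstract can be illustrated with a minimal sketch. It is not the Lindex lattice and does not implement the paper's two heuristics: it is a flat list of key-value pairs (feature subgraph, ids of database graphs containing it), and the `Graph` class, the brute-force `is_subgraph` test, and the function names `build_index` and `query` are all hypothetical names chosen for illustration.

```python
from itertools import permutations


class Graph:
    """Small labeled undirected graph: node -> label, plus a set of edges."""

    def __init__(self, labels, edges):
        self.labels = dict(labels)
        self.edges = {frozenset(e) for e in edges}


def is_subgraph(q, g):
    """Brute-force subgraph isomorphism test: is q contained in g?

    Tries every injective mapping of q's nodes into g's nodes, so it is
    exponential and only suitable for toy graphs.
    """
    q_nodes = list(q.labels)
    for image in permutations(g.labels, len(q_nodes)):
        m = dict(zip(q_nodes, image))
        labels_ok = all(q.labels[v] == g.labels[m[v]] for v in q_nodes)
        edges_ok = all(frozenset(m[v] for v in e) in g.edges for e in q.edges)
        if labels_ok and edges_ok:
            return True
    return False


def build_index(features, database):
    """Map each feature (key) to the ids of database graphs containing it (value)."""
    return [
        (f, {i for i, g in enumerate(database) if is_subgraph(f, g)})
        for f in features
    ]


def query(q, index, database):
    """Return (answer ids, number of verification tests performed)."""
    candidates = set(range(len(database)))
    # Filtering: a graph containing q must contain every feature that q contains.
    for feature, containing in index:
        if is_subgraph(feature, q):
            candidates &= containing
    # Verification: test only the surviving candidates.
    answers = {i for i in candidates if is_subgraph(q, database[i])}
    return answers, len(candidates)


# Toy database of three labeled graphs and two single-edge features (assumed data).
database = [
    Graph({0: "C", 1: "C", 2: "O"}, [(0, 1), (1, 2)]),
    Graph({0: "C", 1: "C"}, [(0, 1)]),
    Graph({0: "C", 1: "O", 2: "N"}, [(0, 1), (1, 2)]),
]
features = [Graph({0: "C", 1: "C"}, [(0, 1)]), Graph({0: "C", 1: "O"}, [(0, 1)])]
index = build_index(features, database)
answers, tests = query(
    Graph({0: "C", 1: "C", 2: "O"}, [(0, 1), (1, 2)]), index, database
)
print(answers, tests)
```

In this toy run, both features occur in the query, so intersecting their value lists leaves a single candidate and only one verification test is needed instead of three. A real index would avoid the linear scan over features at lookup time, which is the kind of cost the lattice organization in the abstract is meant to address.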

Copyright information

© Springer-Verlag 2012

Authors and Affiliations

  1. Department of Computer Science and Engineering, The Pennsylvania State University, University Park, USA
  2. College of Information Sciences and Technology, The Pennsylvania State University, University Park, USA
