Efficient Subgraph Similarity All-Matching

  • Gaoping Zhu
  • Ke Zhu
  • Wenjie Zhang
  • Xuemin Lin
  • Chuan Xiao
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7238)

Abstract

As a fundamental problem in managing graph data, subgraph exact all-matching enumerates all isomorphic matches of a query graph q in a large data graph G. Existing techniques focus on pruning unpromising data graph vertices against q. However, reducing and sharing intermediate matches has not received adequate attention. These two issues become more critical in subgraph similarity all-matching because of the potentially massive number of intermediate matches. This paper studies the problem of efficient subgraph similarity all-matching by developing a novel query processing framework. We propose to decompose a query graph into a hierarchical structure with the aim of minimizing the number of intermediate matches and sharing them. Novel techniques are then developed to estimate the number of intermediate matches, merge the intermediate matches efficiently, and generate efficient query execution plans. Experiments on real and synthetic datasets show that our approach outperforms the state-of-the-art approach by orders of magnitude.
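To make the terms "all-matching", "intermediate matches", and "merge" concrete, the following is a minimal Python sketch. It is not the framework proposed in the paper: it assumes exact (not similarity) matching, undirected vertex-labeled graphs, non-induced matches, and a naive backtracking search. The names all_matches and merge, the toy graphs, and the split of the query into two local patterns are all hypothetical and chosen only for illustration.

```python
def all_matches(q_edges, q_labels, g_adj, g_labels):
    """Enumerate every injective mapping of query vertices to data vertices
    that preserves vertex labels and query edges (non-induced matching)."""
    order = sorted(q_labels)  # fixed vertex order; no plan optimization
    q_adj = {u: set() for u in q_labels}
    for u, v in q_edges:
        q_adj[u].add(v)
        q_adj[v].add(u)
    results = []

    def extend(i, mapping):
        if i == len(order):
            results.append(dict(mapping))
            return
        u = order[i]
        used = set(mapping.values())
        for x in g_adj:
            if x in used or g_labels[x] != q_labels[u]:
                continue
            # Every already-mapped query neighbour must be adjacent to x in G.
            if all(mapping[w] in g_adj[x] for w in q_adj[u] if w in mapping):
                mapping[u] = x
                extend(i + 1, mapping)
                del mapping[u]

    extend(0, {})
    return results


def merge(left, right):
    """Join two sets of intermediate matches on their shared query vertices,
    keeping only combinations that stay injective."""
    merged = []
    for a in left:
        for b in right:
            if any(a[k] != b[k] for k in a.keys() & b.keys()):
                continue
            combined = {**a, **b}
            if len(set(combined.values())) == len(combined):
                merged.append(combined)
    return merged


# Toy data graph G: vertex 1 (label B) is adjacent to vertices 0, 2, 3 (label A).
g_adj = {0: {1}, 1: {0, 2, 3}, 2: {1}, 3: {1}}
g_labels = {0: "A", 1: "B", 2: "A", 3: "A"}

# Query q: the path 0 - 1 - 2 with labels A, B, A.
q_labels = {0: "A", 1: "B", 2: "A"}
full = all_matches([(0, 1), (1, 2)], q_labels, g_adj, g_labels)

# The same query split into two local patterns (one per edge), matched
# separately and then merged on the shared query vertex 1.
left = all_matches([(0, 1)], {0: "A", 1: "B"}, g_adj, g_labels)
right = all_matches([(1, 2)], {1: "B", 2: "A"}, g_adj, g_labels)
merged = merge(left, right)
```

In this toy case the merged intermediate matches coincide with the matches found directly, because every query edge belongs to one of the two local patterns. The nested-loop join in merge is quadratic in the number of intermediate matches, which illustrates why reducing and sharing those matches matters.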

Keywords

Local Pattern; Data Graph; Global Pattern; Similarity Match; Local Match
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Gaoping Zhu (1)
  • Ke Zhu (1)
  • Wenjie Zhang (1)
  • Xuemin Lin (1)
  • Chuan Xiao (1)
  1. The University of New South Wales, Sydney, Australia
