Towards Efficient Subgraph Search in Cloud Computing Environments

  • Yifeng Luo
  • Jihong Guan
  • Shuigeng Zhou
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6637)


This paper proposes an efficient approach to subgraph search over a large graph database under the MapReduce framework. The main idea is to first build inverted edge indexes for the graphs in the database, and then to answer a query by using those indexes to retrieve only the data related to the query subgraph. Experimental results show that the proposed approach performs and scales well.
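The abstract does not give the index layout, so the following is only a minimal sketch of the general idea of an inverted edge index, under stated assumptions: edges are identified by the unordered pair of their endpoint labels, the map and reduce steps are simulated in a single process rather than run on a MapReduce cluster, and all names (`edge_key`, `map_phase`, `reduce_phase`, `build_index`, `candidates`) are hypothetical. It covers candidate filtering only; a real system would still need to verify each candidate with a subgraph isomorphism test.

```python
from collections import defaultdict


def edge_key(a, b):
    """Canonical key for an undirected labeled edge (assumed representation)."""
    return tuple(sorted((a, b)))


def map_phase(graph_id, edges):
    """Emit (edge key, graph id) pairs, as a mapper over one graph would."""
    for a, b in edges:
        yield edge_key(a, b), graph_id


def reduce_phase(pairs):
    """Group graph ids by edge key, producing the inverted edge index."""
    index = defaultdict(set)
    for key, graph_id in pairs:
        index[key].add(graph_id)
    return dict(index)


def build_index(db):
    """Simulate one map/reduce round over a dict {graph id: list of edges}."""
    pairs = (pair for gid, edges in db.items() for pair in map_phase(gid, edges))
    return reduce_phase(pairs)


def candidates(index, query_edges):
    """Return ids of graphs that contain every edge key of the query.

    This is a necessary condition only: the survivors are candidates that
    still require structural verification.
    """
    postings = [index.get(edge_key(a, b), set()) for a, b in query_edges]
    if not postings:
        return set()
    return set.intersection(*postings)
```

Intersecting posting lists means that only graphs sharing every query edge are fetched for verification, which is the sense in which an index of this kind retrieves only data related to the query.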


Keywords: Graph database, Subgraph search, Cloud computing, MapReduce, Inverted index





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Yifeng Luo (1, 2)
  • Jihong Guan (3)
  • Shuigeng Zhou (1, 2)
  1. School of Computer Science, Fudan University, Shanghai, China
  2. Shanghai Key Lab of Intelligent Information Processing, Fudan University, Shanghai, China
  3. Dept. of Computer Science & Technology, Tongji University, Shanghai, China
