Abstract
This paper proposes an efficient approach to subgraph search over a large graph database under the MapReduce framework. The main idea is first to build inverted edge indexes for the graphs in the database, and then to answer a query by using these indexes to retrieve only the data related to the query subgraph. Experimental results show that the proposed approach has good performance and scalability.
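The two-step idea can be illustrated with a minimal single-process sketch. It is not the paper's implementation: the abstract does not specify the index layout, so the edge key (a sorted pair of vertex labels plus an edge label), the function names, and the toy database below are all assumptions made for illustration. The map step emits one (edge key, graph id) pair per edge, the reduce step groups them into posting lists, and a query intersects the posting lists of its own edges.

```python
from collections import defaultdict

def edge_key(u_label, v_label, e_label):
    # Hypothetical key: vertex labels are sorted so that an undirected
    # edge maps to the same key regardless of endpoint order.
    a, b = sorted((u_label, v_label))
    return (a, e_label, b)

def map_graph(graph_id, edges):
    # "Map" step: emit one (edge key, graph id) pair per edge.
    for u_label, v_label, e_label in edges:
        yield edge_key(u_label, v_label, e_label), graph_id

def reduce_postings(pairs):
    # "Reduce" step: group graph ids by edge key into posting lists.
    index = defaultdict(set)
    for key, graph_id in pairs:
        index[key].add(graph_id)
    return index

def candidate_graphs(index, query_edges):
    # Keep only graphs that contain every edge key of the query.
    postings = [index.get(edge_key(u, v, e), set()) for u, v, e in query_edges]
    if not postings:
        return set()
    return set.intersection(*postings)

# Toy database (assumed data): graph id -> list of labeled edges.
database = {
    "g1": [("C", "O", "-"), ("C", "N", "-")],
    "g2": [("C", "O", "-"), ("C", "C", "-")],
    "g3": [("C", "N", "-"), ("N", "O", "-")],
}

pairs = (p for gid, edges in database.items() for p in map_graph(gid, edges))
index = reduce_postings(pairs)
query = [("C", "O", "-"), ("N", "C", "-")]
candidates = candidate_graphs(index, query)
```

The intersection yields only a candidate set: a graph may contain every query edge label without containing the query as a connected subgraph, so in this sketch each candidate would still need a subgraph isomorphism check.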
This work was supported by the National Natural Science Foundation of China under grants No. 60873040 and No. 60873070. Jihong Guan was also supported by the Shuguang Scholar Program of the Shanghai Education Development Foundation under grant No. 09SG23.
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Luo, Y., Guan, J., Zhou, S. (2011). Towards Efficient Subgraph Search in Cloud Computing Environments. In: Xu, J., Yu, G., Zhou, S., Unland, R. (eds) Database Systems for Advanced Applications. DASFAA 2011. Lecture Notes in Computer Science, vol 6637. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20244-5_2
DOI: https://doi.org/10.1007/978-3-642-20244-5_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-20243-8
Online ISBN: 978-3-642-20244-5
eBook Packages: Computer Science, Computer Science (R0)