Bisimulation Reduction of Big Graphs on MapReduce

  • Conference paper
Big Data (BNCOD 2013)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7968)

Abstract

Computing the bisimulation partition of a graph is a fundamental problem that plays a key role in a wide range of basic applications. Intuitively, two nodes in a graph are bisimilar if they share basic structural properties such as labeling and neighborhood topology. In data management, reducing a graph under bisimulation equivalence is a crucial step, e.g., for indexing the graph for efficient query processing. Often, graphs of interest in the real world are massive; examples include social networks and linked open data. For analytics on such graphs, it is becoming increasingly infeasible to rely on in-memory or even I/O-efficient solutions. Hence, a trend in Big Data analytics is the use of distributed computing frameworks such as MapReduce. While there are both internal and external memory solutions for efficiently computing bisimulation, there is, to our knowledge, no effective MapReduce-based solution for bisimulation. Motivated by these observations, we propose in this paper the first efficient MapReduce-based algorithm for computing the bisimulation partition of massive graphs. We also detail several optimizations for handling the data skew that often arises in real-world graphs. We present the results of an extensive empirical study, which demonstrate the effectiveness and scalability of our solution.
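
To make the intuition of bisimilar nodes concrete, the following is a minimal in-memory sketch of signature-based partition refinement. It is not the MapReduce algorithm proposed in the paper, which this page does not describe. The function name `bisimulation_partition`, the input layout (a dict of node labels plus a list of unlabeled directed edges), and the toy graph are assumptions made for illustration only.

```python
from typing import Dict, Hashable, List, Tuple

Node = Hashable


def bisimulation_partition(
    labels: Dict[Node, str], edges: List[Tuple[Node, Node]]
) -> Dict[Node, int]:
    """Naive in-memory partition refinement (illustrative, hypothetical helper).

    Returns a block id per node; nodes with the same id are bisimilar
    with respect to node labels and outgoing edges.
    """
    # Adjacency list: successors of every node.
    succ: Dict[Node, List[Node]] = {n: [] for n in labels}
    for src, dst in edges:
        succ[src].append(dst)

    # Round 0: nodes are grouped by label alone.
    label_ids: Dict[str, int] = {}
    block = {n: label_ids.setdefault(labels[n], len(label_ids)) for n in labels}
    num_blocks = len(label_ids)

    while True:
        # A node's signature is its current block plus the set of blocks
        # it can reach in one step. Equal signatures stay together.
        sig_ids: Dict[tuple, int] = {}
        new_block: Dict[Node, int] = {}
        for n in labels:
            sig = (block[n], frozenset(block[m] for m in succ[n]))
            new_block[n] = sig_ids.setdefault(sig, len(sig_ids))

        block = new_block
        # Blocks can only split, so an unchanged count means a fixpoint.
        if len(sig_ids) == num_blocks:
            return block
        num_blocks = len(sig_ids)


# Toy graph (assumed data): a and c both point to a B-labeled leaf, e is a leaf.
toy_labels = {"a": "A", "b": "B", "c": "A", "d": "B", "e": "A"}
toy_edges = [("a", "b"), ("c", "d")]
toy_partition = bisimulation_partition(toy_labels, toy_edges)
```

In the toy graph, `a` and `c` end up in the same block because they carry the same label and reach equivalent neighbors, while `e` shares their label but is separated because it has no outgoing edge. Each refinement round only needs a node's previous block and those of its neighbors, which is the kind of local computation that distributed frameworks tend to handle well, although how the paper organizes this on MapReduce is not shown here.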

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Luo, Y., de Lange, Y., Fletcher, G.H.L., De Bra, P., Hidders, J., Wu, Y. (2013). Bisimulation Reduction of Big Graphs on MapReduce. In: Gottlob, G., Grasso, G., Olteanu, D., Schallhart, C. (eds) Big Data. BNCOD 2013. Lecture Notes in Computer Science, vol 7968. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39467-6_18

  • DOI: https://doi.org/10.1007/978-3-642-39467-6_18

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-39466-9

  • Online ISBN: 978-3-642-39467-6

  • eBook Packages: Computer Science, Computer Science (R0)
