Formal Derivation of Distributed MapReduce

  • Conference paper
Abstract State Machines, Alloy, B, TLA, VDM, and Z (ABZ 2014)

Abstract

MapReduce is a powerful distributed data processing model that is currently adopted in a wide range of domains to handle large volumes of data efficiently, that is, to cope with the big data surge. In this paper, we propose an approach to the formal derivation of the MapReduce framework. Our approach relies on stepwise refinement in Event-B and, in particular, on the event refinement structure approach, a diagrammatic notation that facilitates formal development. It allows us to derive the system architecture in a systematic and well-structured way. The main principle of MapReduce is to parallelise the processing of data by first mapping them to multiple processing nodes and then merging the results. To support this, we formally define the interdependencies between the map and reduce stages of MapReduce. This formalisation allows us to propose an alternative architectural solution that weakens blocking between the stages and, as a result, achieves a higher degree of parallelisation of MapReduce computations.
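
The difference between a blocking and a less blocking arrangement of the two stages can be illustrated with a small Python sketch. This is not the Event-B model derived in the paper; it is a toy word-count job, and the function names `map_chunk`, `blocking_mapreduce` and `pipelined_mapreduce` are hypothetical names chosen for the illustration. The sketch assumes that the merge step is commutative and associative, so partial results can be merged in any order.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor, as_completed


def map_chunk(chunk):
    """Map task (hypothetical example job): count the words in one chunk."""
    return Counter(chunk.split())


def blocking_mapreduce(chunks):
    """Reduce starts only after every map task has finished."""
    with ThreadPoolExecutor() as pool:
        # list() forces all map results to be collected before reducing.
        partials = list(pool.map(map_chunk, chunks))
    total = Counter()
    for partial in partials:
        total.update(partial)  # merge step: add the per-chunk counts
    return total


def pipelined_mapreduce(chunks):
    """Reduce merges each map result as soon as that map task completes."""
    total = Counter()
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(map_chunk, chunk) for chunk in chunks]
        for future in as_completed(futures):
            # Completion order is arbitrary; the merge is order-independent.
            total.update(future.result())
    return total


chunks = ["a b a", "b c", "a c c"]
print(blocking_mapreduce(chunks) == pipelined_mapreduce(chunks))
```

Both variants compute the same counts for this input; the second simply does not wait for the slowest map task before it begins merging, which is the informal intuition behind weakening the blocking between stages.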

Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Pereverzeva, I., Butler, M., Fathabadi, A.S., Laibinis, L., Troubitsyna, E. (2014). Formal Derivation of Distributed MapReduce. In: Ait Ameur, Y., Schewe, KD. (eds) Abstract State Machines, Alloy, B, TLA, VDM, and Z. ABZ 2014. Lecture Notes in Computer Science, vol 8477. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-43652-3_21

  • DOI: https://doi.org/10.1007/978-3-662-43652-3_21

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-43651-6

  • Online ISBN: 978-3-662-43652-3

  • eBook Packages: Computer Science (R0)
