Using Coq in Specification and Program Extraction of Hadoop MapReduce Applications

  • Conference paper
Software Engineering and Formal Methods (SEFM 2011)

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 7041)

Abstract

Hadoop MapReduce is a framework for distributed computation on key-value pairs. The goal of this research is to verify actual running code of MapReduce applications. We first constructed an abstract model of MapReduce computation with the proof assistant Coq. In the model, mappers and reducers in MapReduce computation are modeled as functions in Coq, and a specification of a MapReduce application is expressed in terms of invariants among functions involving its mapper and reducer. The model also provides modular proofs of lemmas that do not depend on applications. To achieve the goal, we investigated the feasibility of two approaches. In one approach, we transformed verified mapper and reducer functions into Haskell programs and executed them under Hadoop Streaming. In the other approach, we verified JML annotations on Java programs of the mapper and reducer using Krakatoa, translated them into Coq axioms, and proved Coq specifications from them. In either approach, we were able to verify correctness of MapReduce applications that actually run on the Hadoop MapReduce framework.
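The idea of treating a mapper and a reducer as pure functions, and stating a specification as an invariant over them, can be illustrated with a small sketch. This is not the authors' Coq model or their extracted Haskell code; it is a hypothetical word-count example in Python, and every name in it (word_count_mapper, word_count_reducer, shuffle, run_mapreduce, total_is_preserved) is an assumption made for illustration. The invariant checked here, that the counts in the output sum to the number of words in the input, is only tested on sample data, whereas the paper proves such properties in Coq.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

KV = Tuple[str, int]  # an intermediate or output key-value pair


def word_count_mapper(key: str, value: str) -> List[KV]:
    """Emit (word, 1) for every whitespace-separated word in one input line."""
    return [(word, 1) for word in value.split()]


def word_count_reducer(key: str, values: List[int]) -> List[KV]:
    """Sum all counts that were emitted for a single word."""
    return [(key, sum(values))]


def shuffle(pairs: Iterable[KV]) -> Dict[str, List[int]]:
    """Group intermediate values by key, with keys in sorted order."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for k, v in pairs:
        groups[k].append(v)
    return dict(sorted(groups.items()))


def run_mapreduce(
    mapper: Callable[[str, str], List[KV]],
    reducer: Callable[[str, List[int]], List[KV]],
    records: Iterable[Tuple[str, str]],
) -> List[KV]:
    """Sequential stand-in for the framework: map, shuffle, then reduce."""
    intermediate = [pair for k, v in records for pair in mapper(k, v)]
    output: List[KV] = []
    for k, vs in shuffle(intermediate).items():
        output.extend(reducer(k, vs))
    return output


def total_is_preserved(records: List[Tuple[str, str]]) -> bool:
    """Invariant: the output counts sum to the number of words in the input."""
    result = run_mapreduce(word_count_mapper, word_count_reducer, records)
    return sum(c for _, c in result) == sum(len(v.split()) for _, v in records)


if __name__ == "__main__":
    sample = [("0", "a b a"), ("1", "b c")]
    print(run_mapreduce(word_count_mapper, word_count_reducer, sample))
    print(total_is_preserved(sample))
```

Because run_mapreduce takes the mapper and reducer as parameters, the grouping and driver logic is independent of any particular application, which loosely mirrors the abstract's point that some lemmas do not depend on applications.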

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ono, K., Hirai, Y., Tanabe, Y., Noda, N., Hagiya, M. (2011). Using Coq in Specification and Program Extraction of Hadoop MapReduce Applications. In: Barthe, G., Pardo, A., Schneider, G. (eds) Software Engineering and Formal Methods. SEFM 2011. Lecture Notes in Computer Science, vol 7041. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24690-6_24

  • DOI: https://doi.org/10.1007/978-3-642-24690-6_24

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-24689-0

  • Online ISBN: 978-3-642-24690-6

  • eBook Packages: Computer Science, Computer Science (R0)
