Cloud Computing pp 113-125

Part of the Computer Communications and Networks book series (CCN)

A Peer-to-Peer Framework for Supporting MapReduce Applications in Dynamic Cloud Environments

Abstract

MapReduce is a programming model widely used in Cloud computing environments for processing large data sets in a highly parallel way. MapReduce implementations are based on a master-slave model. The failure of a slave is managed by re-assigning its task to another slave, whereas master failures are not managed by current MapReduce implementations, because designers consider such failures unlikely in reliable Cloud systems. However, node failures, including master failures, are likely to happen in dynamic Cloud scenarios, where computing nodes may join and leave the network at an unpredictable rate. Providing effective mechanisms to manage master failures is therefore fundamental to exploiting the MapReduce model for data-intensive applications in dynamic Cloud environments, where current MapReduce implementations could be unreliable. The goal of our work is to extend the master-slave architecture of current MapReduce implementations to make it more suitable for dynamic Cloud scenarios. In particular, in this chapter we present P2P-MapReduce, a framework that exploits a Peer-to-Peer (P2P) model to manage the participation of intermittent nodes, master failures, and MapReduce job recovery in a decentralized but effective way.
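
The slave-failure handling described in the abstract (re-assigning a failed slave's task to another slave) can be illustrated with a minimal sketch. It is not the chapter's implementation or any real MapReduce API: the function `run_job`, the worker names, and the `failing` set are hypothetical, and failures are simulated rather than detected over a network.

```python
from collections import deque


def run_job(tasks, workers, failing):
    """Toy master loop: assign tasks round-robin, re-assign on worker failure.

    tasks:   list of task identifiers
    workers: list of worker names (hypothetical)
    failing: set of worker names that fail when given a task (simulated)
    Returns a dict mapping each task to the worker that completed it.
    """
    pending = deque(tasks)
    alive = list(workers)
    completed = {}
    turn = 0
    while pending:
        if not alive:
            raise RuntimeError("no workers left to run pending tasks")
        task = pending.popleft()
        worker = alive[turn % len(alive)]
        if worker in failing:
            # Simulated worker failure: drop the worker and put its task
            # back at the front of the queue so another worker picks it up.
            alive.remove(worker)
            pending.appendleft(task)
            continue
        completed[task] = worker
        turn += 1
    return completed


if __name__ == "__main__":
    result = run_job(["m0", "m1", "m2"], ["w0", "w1", "w2"], {"w1"})
    print(result)
```

In this sketch all job state (`pending`, `completed`) lives only inside the master loop, so a crash of the master would lose the whole job. That is the unmanaged master-failure case the abstract refers to.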

Copyright information

© Springer London 2010

Authors and Affiliations

  1. Department of Electronics, Computer Science and Systems (DEIS), University of Calabria, Rende, Italy
