Abstract
MapReduce is a programming model widely used in Cloud computing environments for processing large data sets in a highly parallel way. MapReduce implementations are based on a master-slave model. The failure of a slave is managed by re-assigning its task to another slave, whereas master failures are not managed by current MapReduce implementations, because their designers consider such failures unlikely in reliable Cloud systems. However, node failures, including master failures, are likely to happen in dynamic Cloud scenarios, where computing nodes may join and leave the network at an unpredictable rate. Effective mechanisms for managing master failures are therefore fundamental to exploiting the MapReduce model for data-intensive applications in dynamic Cloud environments, where current MapReduce implementations could be unreliable. The goal of our work is to extend the master-slave architecture of current MapReduce implementations to make it more suitable for dynamic Cloud scenarios. In particular, in this chapter we present a Peer-to-Peer (P2P)-MapReduce framework that exploits a P2P model to manage the participation of intermittent nodes, master failures, and MapReduce job recovery in a decentralized but effective way.
Copyright information
© 2010 Springer London
About this chapter
Cite this chapter
Marozzo, F., Talia, D., Trunfio, P. (2010). A Peer-to-Peer Framework for Supporting MapReduce Applications in Dynamic Cloud Environments. In: Antonopoulos, N., Gillam, L. (eds) Cloud Computing. Computer Communications and Networks. Springer, London. https://doi.org/10.1007/978-1-84996-241-4_7
DOI: https://doi.org/10.1007/978-1-84996-241-4_7
Publisher Name: Springer, London
Print ISBN: 978-1-84996-240-7
Online ISBN: 978-1-84996-241-4
eBook Packages: Computer Science (R0)