MapReduce Applications in the Cloud: A Cost Evaluation of Computation and Storage
- Cite this paper as:
- Moise D., Carpen-Amarie A. (2012) MapReduce Applications in the Cloud: A Cost Evaluation of Computation and Storage. In: Hameurlain A., Hussain F.K., Morvan F., Tjoa A.M. (eds) Data Management in Cloud, Grid and P2P Systems. Globe 2012. Lecture Notes in Computer Science, vol 7450. Springer, Berlin, Heidelberg
MapReduce is a powerful paradigm that enables rapid implementation of a wide range of distributed data-intensive applications. The Hadoop project, its main open-source implementation, has recently been widely adopted by the Cloud computing community. This paper evaluates the cost of moving MapReduce applications to the Cloud, in order to find a proper trade-off between cost and performance for this class of applications. We provide a cost evaluation of running MapReduce applications in the Cloud by looking into two aspects: the overhead incurred by executing MapReduce jobs in the Cloud rather than on a Grid, and the actual cost of renting the corresponding Cloud resources. For our evaluation, we compared the runtime of three MapReduce applications executed with the Hadoop framework in two environments: 1) on clusters belonging to the Grid’5000 experimental grid testbed and 2) in a Nimbus Cloud deployed on top of Grid’5000 nodes.
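The two aspects named in the abstract, runtime overhead and rental cost, can be illustrated with a minimal sketch. It is not the authors' method: the function names, the per-instance-hour price and the rule of billing each started hour in full are all assumptions made for illustration, and any numbers passed in are hypothetical rather than results from the paper.

```python
import math


def relative_overhead(grid_seconds: float, cloud_seconds: float) -> float:
    """Fractional slowdown of the Cloud run relative to the Grid run.

    0.25 means the Cloud execution took 25% longer than the Grid one.
    """
    if grid_seconds <= 0:
        raise ValueError("grid_seconds must be positive")
    return (cloud_seconds - grid_seconds) / grid_seconds


def rental_cost(cloud_seconds: float, instances: int,
                price_per_instance_hour: float) -> float:
    """Cost of renting `instances` virtual machines for one job.

    Assumption: every started hour is billed in full, a common but not
    universal pricing rule. The price is a hypothetical input.
    """
    billed_hours = math.ceil(cloud_seconds / 3600)
    return billed_hours * instances * price_per_instance_hour
```

For example, under these assumptions a job that runs for 5400 seconds on 10 instances is billed for two full hours per instance, so the cost scales with the rounded-up runtime rather than the measured one.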