Abstract
This work investigates the performance of Big Data applications in virtualized Hadoop environments hosted on a single physical node. We evaluate applications running on a virtualized Hadoop cluster with separated data and computation layers and compare their performance against a standard Hadoop installation. Our experiments show how different Data-Compute Hadoop cluster configurations, utilizing the same virtualized resources, can influence the performance of CPU-bound and I/O-bound workloads. Based on our observations, we identify three important factors that should be considered when configuring and provisioning virtualized Hadoop clusters.
Acknowledgments
We would like to thank Jeffrey Buell of VMware for providing useful feedback on an early version of this paper and Nikolaos Korfiatis for his helpful comments and support.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Ivanov, T., Zicari, R.V., Buchmann, A. (2015). Benchmarking Virtualized Hadoop Clusters. In: Rabl, T., Sachs, K., Poess, M., Baru, C., Jacobson, HA. (eds) Big Data Benchmarking. WBDB 2014. Lecture Notes in Computer Science, vol 8991. Springer, Cham. https://doi.org/10.1007/978-3-319-20233-4_9
DOI: https://doi.org/10.1007/978-3-319-20233-4_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-20232-7
Online ISBN: 978-3-319-20233-4
eBook Packages: Computer Science (R0)