Benchmarking Virtualized Hadoop Clusters

Big Data Benchmarking (WBDB 2014)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 8991))

Abstract

This work investigates the performance of Big Data applications in virtualized Hadoop environments hosted on a single physical node. We evaluate applications running on a virtualized Hadoop cluster with separated data and computation layers and compare their performance against a standard Hadoop installation. Our experiments show how different Data-Compute Hadoop cluster configurations, utilizing the same virtualized resources, can influence the performance of CPU-bound and I/O-bound workloads. Based on our observations, we identify three important factors that should be considered when configuring and provisioning virtualized Hadoop clusters.

Acknowledgments

We would like to thank Jeffrey Buell of VMware for providing useful feedback on an early version of this paper and Nikolaos Korfiatis for his helpful comments and support.

Author information

Corresponding author

Correspondence to Todor Ivanov.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Ivanov, T., Zicari, R.V., Buchmann, A. (2015). Benchmarking Virtualized Hadoop Clusters. In: Rabl, T., Sachs, K., Poess, M., Baru, C., Jacobson, HA. (eds) Big Data Benchmarking. WBDB 2014. Lecture Notes in Computer Science, vol 8991. Springer, Cham. https://doi.org/10.1007/978-3-319-20233-4_9

  • DOI: https://doi.org/10.1007/978-3-319-20233-4_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-20232-7

  • Online ISBN: 978-3-319-20233-4

  • eBook Packages: Computer Science, Computer Science (R0)
