Influence of the VM Manager on Private Cluster Data Mining System

  • Conference paper
Computer Networks (CN 2014)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 431)

Abstract

This paper presents a comparative analysis of the impact of the virtual machine manager on the performance of a Private Cluster Data Mining System. The research compares the performance of the data mining system under different VM managers and different operating systems. The results were obtained in a test environment that used Ubuntu Desktop 13.10 x64 as the host OS and the Cloudera Hadoop distribution as the personal cluster guest OS. The main focus is the impact of the hypervisor on typical data mining system operations, such as parallelized calculation, file system operations, and the use of CPU resources.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Czerwinski, D. (2014). Influence of the VM Manager on Private Cluster Data Mining System. In: Kwiecień, A., Gaj, P., Stera, P. (eds) Computer Networks. CN 2014. Communications in Computer and Information Science, vol 431. Springer, Cham. https://doi.org/10.1007/978-3-319-07941-7_5

  • DOI: https://doi.org/10.1007/978-3-319-07941-7_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-07940-0

  • Online ISBN: 978-3-319-07941-7

  • eBook Packages: Computer Science, Computer Science (R0)
