
vcluster: a framework for auto scalable virtual cluster system in heterogeneous clouds

Published in: Cluster Computing

Abstract

Cloud computing is an emerging technology and is widely considered for resource utilization in various research areas. One of its main advantages is flexibility in computing resource allocation: many computing cycles can be made ready in a very short time and can be smoothly reallocated between tasks. Because of this, many private companies are entering the new business of reselling their idle computing cycles, and research institutes have also started building their own cloud systems for various research purposes. In this paper, we introduce vcluster, a framework for a virtual cluster system that is capable of utilizing computing resources from heterogeneous clouds and provides a uniform view of computing resource management. vcluster is an IaaS (Infrastructure as a Service) based cloud resource management system. It distributes batch jobs to multiple clouds depending on the status of the queue and the system pool. The main design philosophy behind vcluster is to be cloud and batch-system agnostic, which is achieved through plugins; this feature mitigates the complexity of integrating heterogeneous clouds. In the pilot system development, we use FermiCloud and Amazon EC2, a private and a public cloud system, respectively. We also discuss the features and functionalities that must be considered in virtual cluster systems.
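The queue-driven job distribution described above can be sketched in a few lines. This is a minimal illustrative Python sketch, not the actual vcluster implementation: the `CloudPlugin` interface, the `SimpleCloud` class, and the `provision` function are hypothetical names, and the spill-to-the-next-cloud policy is an assumption about how a private cloud such as FermiCloud might overflow work to a public cloud such as Amazon EC2.

```python
import math
from abc import ABC, abstractmethod

class CloudPlugin(ABC):
    """Hypothetical cloud-agnostic plugin interface (illustrative, not the vcluster API)."""

    @abstractmethod
    def start_worker(self) -> str:
        """Boot one virtual worker node and return its identifier."""

    @abstractmethod
    def stop_worker(self, node_id: str) -> None:
        """Shut a virtual worker node down."""

class SimpleCloud(CloudPlugin):
    """Toy in-memory cloud used only to illustrate the overflow policy."""

    def __init__(self, name: str, capacity: int):
        self.name = name
        self.capacity = capacity
        self.nodes: list[str] = []

    def start_worker(self) -> str:
        if len(self.nodes) >= self.capacity:
            raise RuntimeError(f"{self.name} pool exhausted")
        node_id = f"{self.name}-{len(self.nodes)}"
        self.nodes.append(node_id)
        return node_id

    def stop_worker(self, node_id: str) -> None:
        self.nodes.remove(node_id)

def provision(queued_jobs: int, pool_size: int, cores_per_node: int, clouds):
    """Start enough workers to cover the queue, spilling to the next cloud when one is full."""
    needed = max(0, math.ceil(queued_jobs / cores_per_node) - pool_size)
    started = []
    for cloud in clouds:
        while needed > 0:
            try:
                started.append(cloud.start_worker())
            except RuntimeError:
                break  # this cloud is full; try the next one in the list
            needed -= 1
    return started
```

For example, with a private cloud of capacity 2 listed before a public cloud of capacity 10, 20 queued jobs on 4-core worker nodes would start 5 workers: 2 on the private cloud and 3 overflowing to the public one.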


Notes

  1. Many research groups have been utilizing virtualization technology in HPC (High Performance Computing) [10, 11]. Fermilab also evaluated MPI performance using virtual machines communicating via InfiniBand [12] and found similar performance.

  2. Some applications, such as SAP, even show better performance on virtual machines than on bare metal. Although virtualization incurs a small overhead, the flexibility it provides in computing resource allocation compensates for it and offers many advantages from a resource management perspective.

  3. Transmission delay of virtual machine images is one of the challenges. According to [18], booting a 16 GB virtual machine image transferred from Ottawa to Victoria took 5 hours.

  4. The benchmark used in their evaluation originally comes from SPEC CPU2006 [22] and is specific to HEP.

  5. We do not support multiple plugins at the same time. Therefore, if the current batch plugin is for HTCondor, it has to be unplugged and replaced with a Torque plugin in order to communicate with a Torque batch system. This approach simplifies our implementation, since it is unreasonable to support multiple batch systems in a single cluster system for HEP applications.
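The unplug-and-replace policy in this note can be illustrated with a short sketch. This is hypothetical Python, not the vcluster plugin API: `BatchPlugin`, `HTCondorPlugin`, `TorquePlugin`, and `BatchSlot` are invented names. The point is simply that the slot holds exactly one active plugin at a time, so loading a new plugin replaces the old one.

```python
class BatchPlugin:
    """Minimal batch-system plugin interface (illustrative only)."""
    name = "base"

    def queued_jobs(self) -> int:
        raise NotImplementedError

class HTCondorPlugin(BatchPlugin):
    name = "htcondor"

    def queued_jobs(self) -> int:
        return 0  # a real plugin would query the HTCondor queue here

class TorquePlugin(BatchPlugin):
    name = "torque"

    def queued_jobs(self) -> int:
        return 0  # a real plugin would query the Torque queue here

class BatchSlot:
    """Holds exactly one active plugin; loading a new one replaces the old."""

    def __init__(self):
        self._plugin = None

    def load(self, plugin: BatchPlugin) -> None:
        self._plugin = plugin  # unplug-and-replace: plugins never coexist

    @property
    def active(self) -> str:
        return self._plugin.name
```

Because only one plugin is ever resident, the rest of the system can call the batch interface without dispatching over multiple batch-system types.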

  6. Our experimental system uses the HTCondor batch system, but vcluster is not limited to a specific batch system.

  7. The vcluster GUI, which manages a virtual cluster system, was demonstrated at the Supercomputing Conference in 2012.

  8. These modules are simplified versions that include minimal functionality to evaluate the vcluster concept.

  9. OpenNebula provides three different types of access to the cloud system: OCA, OCCI (Open Cloud Computing Interface) [31], and Amazon EC2 [32]. In our implementation, we use the OCA API directly because, as of this writing, OpenNebula's OCA API offers a more complete set of functions for managing the cloud system than the OCCI and EC2 APIs.

  10. In order to simplify our problem, we assume that all virtual worker nodes have the same number of virtual cores.
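Under this homogeneity assumption, sizing the worker pool reduces to a single ceiling division. The following one-function sketch is illustrative (the name `nodes_needed` is not from the paper):

```python
import math

def nodes_needed(queued_jobs: int, cores_per_node: int) -> int:
    """With homogeneous workers, required node count is just ceil(jobs / cores)."""
    return math.ceil(queued_jobs / cores_per_node)
```

For instance, 10 queued single-slot jobs on 4-core nodes require 3 worker nodes.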

  11. This can be achieved by migrating virtual worker nodes to specific host systems.

References

  1. Foster, I.T., Zhao, Y., Raicu, I., Lu, S.: Cloud computing and grid computing 360-degree compared (2009). arXiv:0901.0131 [cs.DC]

  2. Kim, J.-M., Jeong, H.-Y., Cho, I., Kang, S.M., Park, J.H.: A secure smart-work service model based OpenStack for cloud computing. Clust. Comput. 1–12 (2013). doi:10.1007/s10586-013-0251-1

  3. Ma, X., Li, J., Zhang, F.: Outsourcing computation of modular exponentiations in cloud computing. Clust. Comput. 1–10 (2013). doi:10.1007/s10586-013-0252-0

  4. Stockinger, H., Samar, A., Holtman, K., Allcock, B., Foster, I., Tierney, B.: File and object replication in data grids. Clust. Comput. 5(3), 305–314 (2002)

  5. Iamnitchi, A., Doraimani, S., Garzoglio, G.: Workload characterization in a high-energy data grid and impact on resource management. Clust. Comput. 12(2), 153–173 (2009)

  6. Kivity, A., Kamay, Y., Laor, D., Lublin, U., Liguori, A.: KVM: the Linux virtual machine monitor. In: Proceedings of the Linux Symposium, vol. 1, pp. 225–230 (2007)

  7. KVM. http://www.linux-kvm.org/. May 2013

  8. Xen. http://xen.org/. May 2013

  9. Barham, P., Dragovic, B., Fraser, K., Hand, S., Harris, T.L., Ho, A., Neugebauer, R., Pratt, I., Warfield, A.: Xen and the art of virtualization. In: SOSP, pp. 164–177 (2003)

  10. Kessaci, Y., Melab, N., Talbi, E.-G.: A Pareto-based metaheuristic for scheduling HPC applications on a geographically distributed cloud federation. Clust. Comput. 16(3), 1–18 (2012)

  11. Chen, H., Wu, S., Shi, X., Jin, H., Fu, Y.: LCM: a lightweight communication mechanism in HPC cloud. In: 2011 6th International Conference on Pervasive Computing and Applications (ICPCA), pp. 443–451 (2011)

  12. Chadwick, K.: FermiGrid and FermiCloud update. In: 2012 International Symposium on Grids and Clouds, Taipei, Taiwan (2012)

  13. RedHat. KVM—kernel based virtual machine: www.redhat.com/f/pdf/rhev/DOC-KVM.pdf. May 2013

  14. FermiCloud. http://www-fermicloud.fnal.gov/. May 2013

  15. Amazon EC2. http://aws.amazon.com/ec2/. May 2013

  16. OpenNebula: the open source toolkit for cloud systems. http://opennebula.org/. May 2013

  17. Wolinsky, D.I., Chuchaisri, P., Lee, K., Figueiredo, R.: Experiences with self-organizing, decentralized grids using the grid appliance. Clust. Comput. 16(2), 1–19 (2012)

  18. Gable, I., Agarwal, A., Anderson, M., Armstrong, P., Charbonneau, A., Desmarais, R., Fransham, K., Harris, D., Impey, R., Leavett-Brown, C., Paterson, M., Penfold-Brown, D., Podaima, W., Sobie, R., Vliet, M.: A batch system for HEP applications on a distributed IaaS cloud. In: Proceedings of Computing in High Energy Physics 2010, Taipei, Taiwan, 2010

  19. HTCondor. http://www.cs.wisc.edu/condor/. May 2013

  20. Armstrong, P., Agarwal, A., Bishop, A., Charbonneau, A., Desmarais, R.J., Fransham, K., Hill, N., Gable, I., Gaudet, S., Goliath, S., Impey, R., Leavett-Brown, C., Ouellete, J., Paterson, M., Pritchet, C., Penfold-Brown, D., Podaima, W., Schade, D., Sobie, R.J.: Cloud scheduler: a resource manager for distributed compute clouds (2010). arXiv:1007.0050 [cs.DC]

  21. Alef, M., Gable, I.: HEP specific benchmarks of virtual machines on multi-core CPU architectures. J. Phys. Conf. Ser. 219(5), 052015 (2010)

  22. Henning, J.L.: SPEC CPU2000: measuring CPU performance in the new millennium. Computer 33(7), 28 (2000)

  23. Youseff, L., Wolski, R., Gorda, B., Krintz, C.: Evaluating the performance impact of XEN on MPI and process execution for HPC systems. In: Proceedings of the 2nd International Workshop on Virtualization Technology in Distributed Computing, VTDC ’06, Washington, DC, USA, p. 1. IEEE Comput. Soc., Los Alamitos (2006)

  24. Hussain, M., Abdulsalam, H.M.: Software quality in the clouds: a cloud-based solution. Clust. Comput. 1–14 (2012). doi:10.1007/s10586-012-0233-8

  25. Assunção, M., Costanzo, A., Buyya, R.: A cost-benefit analysis of using cloud computing to extend the capacity of clusters. Clust. Comput. 13(3), 335–347 (2010)

  26. Juve, G., Deelman, E., Vahi, K., Mehta, G., Berriman, B., Berman, B.P., Maechling, P.: Scientific workflow applications on Amazon EC2. In: 2009 5th IEEE International Conference on E-Science Workshops, Dec. 2009, pp. 59–66 (2009)

  27. Raman, R., Livny, M., Solomon, M.: Matchmaking: an extensible framework for distributed resource management. Clust. Comput. 2(2), 129–138 (1999)

  28. Jackson, M.: Moab and Torque achieve high utilization on flagship NERSC XT4 system. In: CUG 2008 Proceedings (2008)

  29. Representational state transfer. http://en.wikipedia.org/wiki/Representational_State_Transfer. May 2013

  30. Java OpenNebula Cloud API 3.0. http://opennebula.org/documentation:rel3.0:java. May 2013

  31. Open Cloud Computing Interface. http://occi-wg.org/. May 2013

  32. OpenNebula Cloud. http://opennebula.org/cloud:cloud. May 2013

Acknowledgement

This research was supported by the National Research Foundation (NRF) of Korea through contract N-13-NM-IR04.

Corresponding author

Correspondence to Haengjin Jang.

Cite this article

Noh, SY., Timm, S.C. & Jang, H. vcluster: a framework for auto scalable virtual cluster system in heterogeneous clouds. Cluster Comput 17, 741–749 (2014). https://doi.org/10.1007/s10586-013-0292-5
