Abstract
Cloud computing is an emerging technology that is being widely considered for resource provisioning in various research areas. One of its main advantages is flexibility in computing resource allocation: many computing cycles can be made ready in a very short time and can be smoothly reallocated between tasks. Because of this, many private companies are entering the new business of reselling their idle computing cycles, and research institutes have started building their own cloud systems for their research purposes. In this paper, we introduce vcluster, a framework for a virtual cluster system that is capable of utilizing computing resources from heterogeneous clouds and provides a uniform view of computing resource management. vcluster is an IaaS (Infrastructure as a Service) based cloud resource management system. It distributes batch jobs to multiple clouds depending on the status of the queue and the system pool. The main design philosophy behind vcluster is to be cloud and batch system agnostic, which is achieved through plugins. This feature mitigates the complexity of integrating heterogeneous clouds. In the pilot system development, we use FermiCloud and Amazon EC2, a private and a public cloud system, respectively. We also discuss the features and functionalities that must be considered in virtual cluster systems.
Notes
In some applications, such as SAP, virtualization even outperforms bare metal. Despite the small virtualization overhead, the flexibility in computing resource allocation compensates for it and provides many advantages from a resource management perspective.
Transmission delay of virtual machine images is one of the challenges. According to [18], it took 5 hours to boot a 16 GB virtual machine image transferred from Ottawa to Victoria.
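As a back-of-envelope check, the transfer time reported in [18] implies an effective throughput of only a few Mbit/s (assuming decimal gigabytes; the reference does not specify GB vs. GiB):

```python
# Effective throughput implied by a 16 GB image taking 5 hours [18].
# Assumes decimal gigabytes (10^9 bytes); illustrative arithmetic only.
IMAGE_BYTES = 16 * 10**9          # 16 GB image
TRANSFER_SECONDS = 5 * 3600       # 5 hours

throughput_mbit_s = IMAGE_BYTES * 8 / TRANSFER_SECONDS / 10**6
print(f"effective throughput: {throughput_mbit_s:.1f} Mbit/s")  # ~7.1 Mbit/s
```

At rates this low, image transfer, not boot time, dominates virtual worker node provisioning across sites.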
The benchmark used in their evaluation originally comes from SPEC CPU2006 [22] and is specific to HEP.
We do not consider supporting multiple plugins at the same time. Therefore, if the current batch plugin is for HTCondor, it has to be unplugged and replaced with a Torque plugin in order to communicate with a Torque batch system. This approach simplifies our implementation, because it is unreasonable to support multiple batch systems in a single cluster system for HEP applications.
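The single-active-plugin design can be sketched as follows. This is a hypothetical illustration: the class and method names (`BatchPlugin`, `replace_plugin`, `submit`) are illustrative and are not vcluster's actual API.

```python
# Sketch of the one-plugin-at-a-time design: the framework holds exactly one
# batch plugin, and switching batch systems means replacing the whole plugin.
class BatchPlugin:
    """Common interface every batch-system plugin implements (illustrative)."""
    name = "abstract"

    def submit(self, job):
        raise NotImplementedError


class HTCondorPlugin(BatchPlugin):
    name = "htcondor"

    def __init__(self):
        self.queue = []

    def submit(self, job):
        self.queue.append(job)


class TorquePlugin(BatchPlugin):
    name = "torque"

    def __init__(self):
        self.queue = []

    def submit(self, job):
        self.queue.append(job)


class VCluster:
    """Holds a single active batch plugin; no multi-plugin support."""

    def __init__(self, plugin):
        self.plugin = plugin

    def replace_plugin(self, plugin):
        # Unplug the current batch plugin and plug in the new one.
        self.plugin = plugin

    def submit(self, job):
        self.plugin.submit(job)


vc = VCluster(HTCondorPlugin())
vc.submit("job-1")                 # goes to HTCondor
vc.replace_plugin(TorquePlugin())  # HTCondor is now unreachable
vc.submit("job-2")                 # goes to Torque
print(vc.plugin.name)              # torque
```

Because only one plugin is live at a time, the core framework never has to merge state from two batch systems, which is the simplification the note describes.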
In our experiment system, we use the HTCondor batch system, but vcluster is not limited to a specific batch system.
The vcluster GUI, which manages a virtual cluster system, was demonstrated at the Supercomputing Conference in 2012.
Such modules are simplified versions that include minimal functionality to evaluate the concept of vcluster.
OpenNebula provides three different types of access to the cloud system: OCA, OCCI (Open Cloud Computing Interface) [31], and the Amazon EC2 interface [32]. In our implementation, we directly use the OCA API because, as of this writing, OpenNebula's OCA API provides a more complete set of functions for managing the cloud system than the OCCI and EC2 APIs.
In order to simplify our problem, we assume all virtual worker nodes have the same number of virtual cores.
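Under the equal-cores assumption, sizing the virtual pool against the batch queue reduces to simple arithmetic. The helper below is a hypothetical sketch, not vcluster's actual scheduling logic:

```python
import math

def nodes_needed(queued_jobs: int, cores_per_node: int, running_nodes: int) -> int:
    """Extra virtual worker nodes to boot so every queued job gets a core.

    Assumes every node contributes the same number of cores, as in the note.
    """
    required = math.ceil(queued_jobs / cores_per_node)
    return max(0, required - running_nodes)

# 10 queued jobs, 4 cores per node, 1 node already running -> boot 2 more.
print(nodes_needed(queued_jobs=10, cores_per_node=4, running_nodes=1))  # 2
```

If nodes had heterogeneous core counts, the same decision would become a bin-packing problem, which is exactly the complexity the assumption avoids.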
This can be achieved with virtual machine migration technology that relocates virtual worker nodes to specific host systems.
References
Foster, I.T., Zhao, Y., Raicu, I., Lu, S.: Cloud computing and grid computing 360-degree compared (2009). 0901.0131 [cs.DC]
Kim, J.-M., Jeong, H.-Y., Cho, I., Kang, S.M., Park, J.H.: A secure smart-work service model based OpenStack for cloud computing. Clust. Comput. 1–12 (2013). doi:10.1007/s10586-013-0251-1
Ma, X., Li, J., Zhang, F.: Outsourcing computation of modular exponentiations in cloud computing. Clust. Comput. 1–10 (2013). doi:10.1007/s10586-013-0252-0
Stockinger, H., Samar, A., Holtman, K., Allcock, B., Foster, I., Tierney, B.: File and object replication in data grids. Clust. Comput. 5(3), 305–314 (2002)
Iamnitchi, A., Doraimani, S., Garzoglio, G.: Workload characterization in a high-energy data grid and impact on resource management. Clust. Comput. 12(2), 153–173 (2009)
Kivity, A., Kamay, Y., Laor, D., Lublin, U., Liguori, A.: KVM: the Linux virtual machine monitor. In: Proceedings of the Linux Symposium, vol. 1, pp. 225–230 (2007)
KVM. http://www.linux-kvm.org/. May 2013
Xen. http://xen.org/. May 2013
Barham, P., Dragovic, B., Fraser, K., Hand, S., Harris, T.L., Ho, A., Neugebauer, R., Pratt, I., Warfield, A.: Xen and the art of virtualization. In: SOSP, pp. 164–177 (2003)
Kessaci, Y., Melab, N., Talbi, E.-G.: A Pareto-based metaheuristic for scheduling HPC applications on a geographically distributed cloud federation. Clust. Comput. 16(3), 1–18 (2012)
Chen, H., Wu, S., Shi, X., Jin, H., Fu, Y.: LCM: a lightweight communication mechanism in HPC cloud. In: 2011 6th International Conference on Pervasive Computing and Applications (ICPCA), pp. 443–451 (2011)
Chadwick, K.: FermiGrid and FermiCloud update. In: 2012 International Symposium on Grids and Clouds, Taipei, Taiwan (2012)
RedHat. KVM—kernel based virtual machine. www.redhat.com/f/pdf/rhev/DOC-KVM.pdf. May 2013
FermiCloud. http://www-fermicloud.fnal.gov/. May 2013
Amazon EC2. http://aws.amazon.com/ec2/. May 2013
OpenNebula: the open source toolkit for cloud systems. http://opennebula.org/. May 2013
Wolinsky, D.I., Chuchaisri, P., Lee, K., Figueiredo, R.: Experiences with self-organizing, decentralized grids using the grid appliance. Clust. Comput. 16(2), 1–19 (2012)
Gable, I., Agarwal, A., Anderson, M., Armstrong, P., Charbonneau, A., Desmarais, R., Fransham, K., Harris, D., Impey, R., Leavett-Brown, C., Paterson, M., Penfold-Brown, D., Podaima, W., Sobie, R., Vliet, M.: A batch system for HEP applications on a distributed IaaS cloud. In: Proceedings of Computing in High Energy Physics 2010, Taipei, Taiwan, 2010
HTCondor. http://www.cs.wisc.edu/condor/. May 2013
Armstrong, P., Agarwal, A., Bishop, A., Charbonneau, A., Desmarais, R.J., Fransham, K., Hill, N., Gable, I., Gaudet, S., Goliath, S., Impey, R., Leavett-Brown, C., Ouellete, J., Paterson, M., Pritchet, C., Penfold-Brown, D., Podaima, W., Schade, D., Sobie, R.J.: Cloud scheduler: a resource manager for distributed compute clouds (2010). 1007.0050 [cs.DC]
Alef, M., Gable, I.: HEP specific benchmarks of virtual machines on multi-core CPU architectures. J. Phys. Conf. Ser. 219(5), 052015 (2010)
Henning, J.L.: SPEC CPU2000: measuring CPU performance in the new millennium. Computer 33(7), 28 (2000)
Youseff, L., Wolski, R., Gorda, B., Krintz, C.: Evaluating the performance impact of XEN on MPI and process execution for HPC systems. In: Proceedings of the 2nd International Workshop on Virtualization Technology in Distributed Computing, VTDC ’06, Washington, DC, USA, p. 1. IEEE Comput. Soc., Los Alamitos (2006)
Hussain, M., Abdulsalam, H.M.: Software quality in the clouds: a cloud-based solution. Clust. Comput. 1–14 (2012). doi:10.1007/s10586-012-0233-8
Assunção, M., Costanzo, A., Buyya, R.: A cost-benefit analysis of using cloud computing to extend the capacity of clusters. Clust. Comput. 13(3), 335–347 (2010)
Juve, G., Deelman, E., Vahi, K., Mehta, G., Berriman, B., Berman, B.P., Maechling, P.: Scientific workflow applications on Amazon EC2. In: 2009 5th IEEE International Conference on E-Science Workshops, Dec. 2009, pp. 59–66 (2009)
Raman, R., Livny, M., Solomon, M.: Matchmaking: an extensible framework for distributed resource management. Clust. Comput. 2(2), 129–138 (1999)
Jackson, M.: Moab and Torque achieve high utilization on flagship NERSC XT4 system. In: CUG 2008 Proceedings (2008)
Representational state transfer. http://en.wikipedia.org/wiki/Representational_State_Transfer. May 2013
Java OpenNebula Cloud API 3.0. http://opennebula.org/documentation:rel3.0:java. May 2013
Open Cloud Computing Interface. http://occi-wg.org/. May 2013
OpenNebula Cloud. http://opennebula.org/cloud:cloud. May 2013
Acknowledgement
This research was supported by the National Research Foundation (NRF) of Korea through contract N-13-NM-IR04.
Cite this article
Noh, SY., Timm, S.C. & Jang, H. vcluster: a framework for auto scalable virtual cluster system in heterogeneous clouds. Cluster Comput 17, 741–749 (2014). https://doi.org/10.1007/s10586-013-0292-5