On the role of application and resource characterizations in heterogeneous distributed computing systems

Abstract

Loosely coupled applications composed of a potentially very large number of tasks (from tens of thousands to even billions) are commonly used in the high-throughput computing and many-task computing paradigms. To efficiently execute large-scale computations that can exceed the capability of a single type of computing resource within the expected time, we should be able to effectively integrate resources from heterogeneous distributed computing (HDC) systems such as clusters, grids, and clouds. In this paper, we quantitatively analyze the performance of three real scientific applications, each consisting of many tasks, on HDC systems based on a partnership of distributed computing clusters, grids, and clouds. The analysis aims to understand the application and resource characteristics and to show practical issues that ordinary scientific users can face when leveraging these systems. Our experimental study shows that the performance of a loosely coupled application can be significantly affected by the characteristics of an HDC system, along with the hardware specification of a node, and that their impact on performance can vary widely depending on the resource usage pattern of each application. Based on our experimental study, we then devise a preference-based scheduling algorithm that can reflect the characteristics and resource usage patterns of various loosely coupled applications running on HDC systems. The algorithm allocates resources from different HDC systems to loosely coupled applications based on the applications' preferences for those systems. Using trace-based simulations, we evaluate the overall system performance over various preference types, which can be determined based on different factors such as CPU specifications and application throughputs. Our simulation results demonstrate the importance of understanding the application and resource characteristics for effective scheduling of loosely coupled applications on HDC systems.
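
The idea of allocating resources according to application preferences can be illustrated with a minimal sketch. This is not the algorithm devised in the paper, whose details the abstract does not give; it is a simple greedy, rank-by-rank allocation under stated assumptions. The function name `allocate_by_preference`, the application names `app_a` and `app_b`, the slot counts, and the preference orderings are all hypothetical. Each application is assumed to supply an ordered list of HDC systems, most preferred first, which could in principle be derived from factors such as CPU specifications or measured application throughput.

```python
from typing import Dict, List


def allocate_by_preference(
    demands: Dict[str, int],
    preferences: Dict[str, List[str]],
    capacities: Dict[str, int],
) -> Dict[str, Dict[str, int]]:
    """Greedy preference-based allocation (illustrative sketch only).

    demands:     number of resource slots each application requests
    preferences: per application, HDC systems ordered from most to least preferred
    capacities:  number of free slots in each HDC system
    Returns, per application, the slots granted on each system.
    """
    free = dict(capacities)        # copy so the caller's dict is not mutated
    remaining = dict(demands)
    allocation: Dict[str, Dict[str, int]] = {app: {} for app in demands}

    # Serve every application's first choice, then every second choice, etc.
    max_rank = max((len(p) for p in preferences.values()), default=0)
    for rank in range(max_rank):
        for app in demands:
            prefs = preferences.get(app, [])
            if rank >= len(prefs) or remaining[app] == 0:
                continue
            system = prefs[rank]
            granted = min(remaining[app], free.get(system, 0))
            if granted > 0:
                allocation[app][system] = allocation[app].get(system, 0) + granted
                free[system] -= granted
                remaining[app] -= granted
    return allocation


if __name__ == "__main__":
    # Hypothetical workload: two applications, three HDC systems.
    result = allocate_by_preference(
        demands={"app_a": 6, "app_b": 4},
        preferences={
            "app_a": ["cluster", "cloud", "grid"],
            "app_b": ["cloud", "cluster"],
        },
        capacities={"cluster": 4, "grid": 8, "cloud": 5},
    )
    print(result)
```

Within a rank, this sketch serves applications in dictionary order, so an earlier application can take slots that a later one also prefers. A fairer policy would need an explicit tie-breaking or sharing rule, which is outside the scope of this illustration.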


Acknowledgments

This work was partly supported by Institute for Information & Communications Technology Promotion (IITP) grant funded by the Korea government (MSIP) (No. R0190-16-2012, High Performance Big Data Analytics Platform Performance Acceleration Technologies Development) and the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT and Future Planning (NRF-2015R1C1A1A02037400).

Author information

Corresponding author

Correspondence to Young-ri Choi.

About this article

Cite this article

Hwang, E., Kim, S., Kim, J. et al. On the role of application and resource characterizations in heterogeneous distributed computing systems. Cluster Comput 19, 2225–2240 (2016). https://doi.org/10.1007/s10586-016-0638-x

Keywords

  • Loosely coupled applications
  • High-throughput computing
  • Many-task computing
  • Performance evaluation
  • Heterogeneous distributed computing systems
  • Scheduling algorithm