Scheduling Many-Task Applications on Multi-clouds and Hybrid Clouds

  • Conference paper
Asynchronous Many-Task Systems and Applications (WAMTA 2023)

Abstract

A centralized scheduler can become a bottleneck for placing the tasks of a many-task application on heterogeneous cloud resources. We have previously demonstrated that a decentralized vector scheduling approach based on performance measurements can be used successfully for this task placement scenario. We then extended this approach to task placement based on latency measurements. Each node collects the performance measurements from its neighbors on an overlay graph, measures the communication latency, and then makes local decisions on where to move tasks. Our recent experiments in CloudLab with nodes allocated on multiple cloud sites demonstrate that using latency in our vector scheduling approach results in better performance and resource utilization. While our algorithm for configuring the overlay graph based on latency measurements was beneficial with simulated communication delays, it was not beneficial in the multi-cloud environment.
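The local decision described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the `Neighbor` record, the `expected_finish` cost estimate, and the `pick_target` rule are hypothetical names and assumptions introduced here only to show how a node might combine neighbor performance measurements with measured latency when deciding where to move a task.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Neighbor:
    """Hypothetical record of what a node knows about one overlay neighbor."""
    name: str
    queue_len: int   # tasks currently queued on the neighbor
    rate: float      # measured throughput, in tasks per second
    latency: float   # measured communication latency to the neighbor, in seconds


def expected_finish(queue_len: int, rate: float, latency: float = 0.0) -> float:
    """Estimate when one additional task sent to this node would finish.

    Assumption: a simple model of transfer latency plus queueing time.
    """
    return latency + (queue_len + 1) / rate


def pick_target(local_queue: int, local_rate: float,
                neighbors: List[Neighbor]) -> Optional[str]:
    """Return the neighbor that would finish the last local task sooner,
    or None if keeping the task locally is estimated to be at least as fast."""
    best_name = None
    best_time = local_queue / local_rate  # finish time of the last local task
    for n in neighbors:
        t = expected_finish(n.queue_len, n.rate, n.latency)
        if t < best_time:
            best_name, best_time = n.name, t
    return best_name


if __name__ == "__main__":
    neighbors = [
        Neighbor("same-site", queue_len=2, rate=1.0, latency=0.01),
        Neighbor("remote-site", queue_len=0, rate=2.0, latency=5.0),
    ]
    print(pick_target(local_queue=10, local_rate=1.0, neighbors=neighbors))
```

In this toy model a fast but distant neighbor can lose to a slower nearby one, because the latency term is added to its estimated completion time. That is the intuition behind including latency measurements in the placement decision.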

Corresponding author

Correspondence to Gerald Baumgartner.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

Cite this paper

Mithila, S.P., Franz, P., Baumgartner, G. (2023). Scheduling Many-Task Applications on Multi-clouds and Hybrid Clouds. In: Diehl, P., Thoman, P., Kaiser, H., Kale, L. (eds) Asynchronous Many-Task Systems and Applications. WAMTA 2023. Lecture Notes in Computer Science, vol 13861. Springer, Cham. https://doi.org/10.1007/978-3-031-32316-4_6

  • DOI: https://doi.org/10.1007/978-3-031-32316-4_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-32315-7

  • Online ISBN: 978-3-031-32316-4

  • eBook Packages: Computer Science, Computer Science (R0)
