Cloud Infrastructure Automation for Scientific Workflows

  • Bartosz Balis
  • Michal Orzechowski
  • Krystian Pawlik
  • Maciej Pawlik
  • Maciej Malawski
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12043)


We present a solution for cloud infrastructure automation for scientific workflows. Unlike existing approaches, our solution is based on widely adopted tools, such as Terraform, and strictly separates two concerns: infrastructure description and provisioning on the one hand, and workflow description on the other. At the same time, it enables comprehensive integration with a given cloud infrastructure, i.e. one in which workflow execution can be managed by the cloud. The solution is integrated with our HyperFlow workflow management system and is evaluated by demonstrating its use in experiments on auto-scaling of scientific workflows in two types of cloud infrastructure: containerized Infrastructure-as-a-Service (IaaS) and Function-as-a-Service (FaaS). The experimental evaluation involves deploying and executing a test workflow on an Amazon ECS/Docker cluster and on a hybrid of Amazon ECS and AWS Lambda. The results show that our solution not only helps create repeatable infrastructures for scientific computing but also greatly facilitates the automation of research experiments on executing scientific workflows on advanced computing infrastructures.


Keywords: Scientific workflows, Infrastructure automation, Autoscaling



This work was supported by the National Science Centre, Poland, grant 2016/21/B/ST6/01497.



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Bartosz Balis (1)
  • Michal Orzechowski (1)
  • Krystian Pawlik (1)
  • Maciej Pawlik (1)
  • Maciej Malawski (1)
  1. AGH University of Science and Technology, Krakow, Poland
