HyperLoom Possibilities for Executing Scientific Workflows on the Cloud

  • Vojtech Cima
  • Stanislav Böhm
  • Jan Martinovič
  • Jiří Dvorský
  • Thomas J. Ashby
  • Vladimir Chupakhin
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 611)

Abstract

We have developed HyperLoom, a platform for defining and executing scientific workflows on large-scale HPC systems. The computational tasks in such workflows often have non-trivial dependency patterns, unknown execution times, and unknown output sizes. HyperLoom executes these workflows efficiently, respecting task requirements and cluster resources, regardless of the shape or size of the workflow. Although HPC infrastructures provide unbeatable performance, they may be unavailable or too expensive, especially for small to medium workloads. Moreover, because HPC resource allocation policies are not very flexible, the energy efficiency of the system may be suboptimal at some stages of execution for some workloads. In contrast, current public cloud providers such as Amazon, Google, or Exoscale offer users a convenient and elastic way of deploying, scaling, and disposing of a virtualized cluster of almost any size. In this paper, we describe the virtualization of HyperLoom and evaluate its performance in a virtualized environment using workflows of various shapes and sizes. Finally, we discuss the potential for expanding HyperLoom to cloud environments.

Keywords

Cloud, Virtualization, Distributed environments, Scientific workflows, HPC

Notes

Acknowledgements

This project has received funding from the European Union’s Horizon 2020 Research and Innovation programme under Grant Agreement No. 671555. This work was supported by The Ministry of Education, Youth and Sports from the National Programme of Sustainability (NPU II) project IT4Innovations excellence in science - LQ1602 and by the IT4Innovations infrastructure which is supported from the Large Infrastructures for Research, Experimental Development and Innovations project IT4Innovations National Supercomputing Center LM2015070.

Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

  • Vojtech Cima
    • 1
  • Stanislav Böhm
    • 1
  • Jan Martinovič
    • 1
  • Jiří Dvorský
    • 1
  • Thomas J. Ashby
    • 2
  • Vladimir Chupakhin
    • 3
  1. IT4Innovations, VŠB Technical University of Ostrava, Ostrava, Czech Republic
  2. IMEC, Brussels, Belgium
  3. Janssen Pharmaceutica NV, Brussels, Belgium