Journal of Grid Computing, Volume 11, Issue 3, pp 457–480

Bundle and Pool Architecture for Multi-Language, Robust, Scalable Workflow Executions

  • David Rogers
  • Ian Harvey
  • Tram Truong Huu
  • Kieran Evans
  • Tristan Glatard
  • Ibrahim Kallel
  • Ian Taylor
  • Johan Montagnat
  • Andrew Jones
  • Andrew Harrison
Article

Abstract

In this paper, we leverage the previous work on the SHIWA bundling format and expand on this specification in order to facilitate workflow execution within a multi-workflow environment. We introduce a scalable and robust execution pool environment that supports workflows consisting of sub-workflows built upon a multitude of different workflow engines and environments, and also provide a common workflow representation for seamless connectivity through serialization to workflow bundles. We also present a meta-workflow scenario based upon this system. Workflow bundles employ the lightweight Open Archives Initiative Object Reuse and Exchange (ORE) Web-based standard to provide a common format for representing and sharing workflows and the associated metadata required for their execution. This generalized bundling approach is already available within five workflow engines and has proven a useful environment for inter-workflow experimentation. The execution pool facilitates federated access to multiple distributed computing infrastructures supported by the underlying workflow engines subscribed to the pool. Workflow bundles are exposed using the eXtensible Messaging and Presence Protocol (XMPP), which provides the necessary communication backbone to enable multiple workflow engine agents to asynchronously publish and subscribe to bundles in meta-workflow pipelines. We present experiments showing the scalability and robustness of the pool execution approach, with results showing that overheads remain controlled for up to 150 workflow agents and that agent failures have very limited impact. We then demonstrate the applicability of our architecture by describing how a Java-based music analysis workflow can be distributed within such a multi-workflow environment consisting of the Triana and MOTEUR workflow engines.
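
The publish/subscribe pattern described above can be illustrated with a minimal Python sketch. It is not the system presented in the paper: the pool here is a plain in-memory queue rather than an XMPP service, the bundle is a simple record rather than an ORE document, and every name (`Bundle`, `Pool`, `Agent`, `run_demo`, the engine labels and payload strings) is a hypothetical choice made for illustration. The sketch only shows how agents tied to different engines might pick up bundles addressed to them, and how a bundle could return to the pool when an agent fails.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Bundle:
    """Hypothetical stand-in for a workflow bundle: target engine plus payload."""
    engine: str
    payload: str
    result: Optional[str] = None


class Pool:
    """In-memory stand-in for the execution pool (assumption: no XMPP here)."""

    def __init__(self) -> None:
        # One FIFO queue of pending bundles per engine name.
        self._queues = defaultdict(deque)

    def publish(self, bundle: Bundle) -> None:
        self._queues[bundle.engine].append(bundle)

    def take(self, engine: str) -> Optional[Bundle]:
        queue = self._queues[engine]
        return queue.popleft() if queue else None


class Agent:
    """Hypothetical workflow-engine agent subscribed to one engine type."""

    def __init__(self, name: str, engine: str, pool: Pool, fail: bool = False) -> None:
        self.name, self.engine, self.pool, self.fail = name, engine, pool, fail

    def step(self) -> Optional[Bundle]:
        bundle = self.pool.take(self.engine)
        if bundle is None:
            return None
        if self.fail:
            # Simulated failure: hand the bundle back so another agent can run it.
            self.pool.publish(bundle)
            return None
        bundle.result = f"{self.name} ran {bundle.payload}"
        return bundle


def run_demo() -> list:
    pool = Pool()
    pool.publish(Bundle("triana", "extract-features"))
    pool.publish(Bundle("moteur", "classify"))
    agents = [
        Agent("a1", "triana", pool, fail=True),  # fails and returns its bundle
        Agent("a2", "triana", pool),
        Agent("a3", "moteur", pool),
    ]
    done = [agent.step() for agent in agents]
    return [bundle.result for bundle in done if bundle is not None]
```

Calling `run_demo()` returns the results produced by the two healthy agents; the bundle first taken by the failing agent is completed by the second agent on the same engine. Requeueing on failure is one simple design choice for this sketch and is not claimed to be the recovery mechanism used in the paper.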

Keywords

Scientific workflows; Distributed computing infrastructure; Grid computing; Cloud computing; Interoperability; Data modelling

Copyright information

© Springer Science+Business Media Dordrecht 2013

Authors and Affiliations

  • David Rogers (1)
  • Ian Harvey (1)
  • Tram Truong Huu (2)
  • Kieran Evans (1)
  • Tristan Glatard (3)
  • Ibrahim Kallel (3)
  • Ian Taylor (1)
  • Johan Montagnat (2)
  • Andrew Jones (1)
  • Andrew Harrison (4)

  1. School of Computer Science and Informatics, Cardiff University, Cardiff, UK
  2. I3S Laboratory, MODALIS Team, CNRS/UNS, Sophia Antipolis, France
  3. CREATIS; CNRS UMR5220, Inserm U1044; INSA-Lyon; Université Lyon 1, Université de Lyon, Lyon, France
  4. Healthcare Solutions, Harris Corporation, Scottsdale, USA
