Computer-Assisted Scientific Workflow Design

Journal of Grid Computing

Abstract

Workflows are increasingly adopted to describe large-scale data- and compute-intensive processes that can take advantage of today’s Distributed Computing Infrastructures. Still, most Scientific Workflow formalisms are notoriously difficult to fully exploit, as they entangle the description of scientific processes and their implementation, blurring the lines between what is done and how it is done as well as between what is and what is not infrastructure-dependent. This work addresses the problem of data-intensive Scientific Workflow design by describing scientific experiments at a higher level of abstraction, emphasizing scientific concepts over technicalities, easing the separation of functional and non-functional concerns and leveraging domain knowledge. To achieve this goal, we propose a model-driven approach enhanced with Knowledge Engineering technologies. The main contributions of this work are a semantic Scientific Workflow model to capture user goals and a generative process assisting the transformation from high-level models to executable workflow artefacts.

Author information

Corresponding author

Correspondence to Nadia Cerezo.

About this article

Cite this article

Cerezo, N., Montagnat, J. & Blay-Fornarino, M. Computer-Assisted Scientific Workflow Design. J Grid Computing 11, 585–612 (2013). https://doi.org/10.1007/s10723-013-9264-5

  • DOI: https://doi.org/10.1007/s10723-013-9264-5
