Journal of Grid Computing, Volume 11, Issue 3, pp 585–612

Computer-Assisted Scientific Workflow Design

  • Nadia Cerezo
  • Johan Montagnat
  • Mireille Blay-Fornarino

Abstract

Workflows are increasingly adopted to describe large-scale data- and compute-intensive processes that can take advantage of today’s Distributed Computing Infrastructures. Still, most Scientific Workflow formalisms are notoriously difficult to fully exploit, as they entangle the description of scientific processes and their implementation, blurring the lines between what is done and how it is done as well as between what is and what is not infrastructure-dependent. This work addresses the problem of data-intensive Scientific Workflow design by describing scientific experiments at a higher level of abstraction, emphasizing scientific concepts over technicalities, easing the separation of functional and non-functional concerns and leveraging domain knowledge. To achieve this goal, we propose a model-driven approach enhanced with Knowledge Engineering technologies. The main contributions of this work are a semantic Scientific Workflow model to capture user goals and a generative process assisting the transformation from high-level models to executable workflow artefacts.

Keywords

Scientific workflow design; Workflow modeling; Workflow composition; Semantic workflow

Copyright information

© Springer Science+Business Media Dordrecht 2013

Authors and Affiliations

  • Nadia Cerezo (1)
  • Johan Montagnat (1)
  • Mireille Blay-Fornarino (1)
  1. I3S Laboratory, Université Nice Sophia Antipolis/CNRS, Sophia Antipolis, France
