Abstract
Workflows are increasingly adopted to describe large-scale data- and compute-intensive processes that can take advantage of today’s Distributed Computing Infrastructures. Still, most Scientific Workflow formalisms are notoriously difficult to fully exploit, as they entangle the description of scientific processes and their implementation, blurring the lines between what is done and how it is done as well as between what is and what is not infrastructure-dependent. This work addresses the problem of data-intensive Scientific Workflow design by describing scientific experiments at a higher level of abstraction, emphasizing scientific concepts over technicalities, easing the separation of functional and non-functional concerns and leveraging domain knowledge. To achieve this goal, we propose a model-driven approach enhanced with Knowledge Engineering technologies. The main contributions of this work are a semantic Scientific Workflow model to capture user goals and a generative process assisting the transformation from high-level models to executable workflow artefacts.
Cite this article
Cerezo, N., Montagnat, J. & Blay-Fornarino, M. Computer-Assisted Scientific Workflow Design. J Grid Computing 11, 585–612 (2013). https://doi.org/10.1007/s10723-013-9264-5
DOI: https://doi.org/10.1007/s10723-013-9264-5