Taverna, Reloaded

  • Paolo Missier
  • Stian Soiland-Reyes
  • Stuart Owen
  • Wei Tan
  • Alexandra Nenadic
  • Ian Dunlop
  • Alan Williams
  • Tom Oinn
  • Carole Goble
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6187)

Abstract

The Taverna workflow management system is an open-source project with a history of widespread adoption within multiple experimental science communities, and a long-term ambition of effectively supporting the evolving needs of those communities for complex, data-intensive, service-based experimental pipelines. This short paper describes how the recently overhauled technical architecture of Taverna addresses issues of efficiency, scalability, and extensibility, and presents performance results based on a collection of synthetic workflows, as well as a concrete case study involving a production workflow in the area of cancer research.

Keywords

Execution Time, Memory Usage, Input Port, Execution Model, Concurrent Thread
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Paolo Missier (1)
  • Stian Soiland-Reyes (1)
  • Stuart Owen (1)
  • Wei Tan (2)
  • Alexandra Nenadic (1)
  • Ian Dunlop (1)
  • Alan Williams (1)
  • Tom Oinn (3)
  • Carole Goble (1)

  1. School of Computer Science, The University of Manchester, UK
  2. Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, USA
  3. European Bioinformatics Institute, UK
