Kepler/pPOD: Scientific Workflow and Provenance Support for Assembling the Tree of Life

  • Shawn Bowers
  • Timothy McPhillips
  • Sean Riddle
  • Manish Kumar Anand
  • Bertram Ludäscher
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5272)

Abstract

The complexity of scientific workflows for analyzing biological data creates a number of challenges for current workflow and provenance systems. This complexity is due in part to the nature of scientific data (e.g., heterogeneous, nested data collections) and in part to the programming constructs required for automation (e.g., nested workflows, looping, pipeline parallelism). To address these challenges, we present an extended version of the Kepler scientific workflow system tailored for the systematics community. Our system combines novel approaches for representing scientific data, modeling and automating complex analyses, and recording and browsing the associated provenance information.

Keywords

Execution Trace, Read Scope, Data Provenance, Common Data Model, Provenance Information
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Shawn Bowers (1)
  • Timothy McPhillips (1)
  • Sean Riddle (1)
  • Manish Kumar Anand (2)
  • Bertram Ludäscher (1, 2)

  1. UC Davis Genome Center, University of California, Davis, USA
  2. Department of Computer Science, University of California, Davis, USA
