An Overview of the BioExtract Server: A Distributed, Web-Based System for Genomic Analysis
Genome research is becoming increasingly dependent on access to multiple, distributed data sources and bioinformatic tools. The importance of integration across distributed databases and Web services will continue to grow as the number of requisite resources expands. Use of bioinformatic workflows has grown considerably in recent years as scientific research relies more heavily on the analysis of large data sets and the use of distributed resources. The BioExtract Server (http://bioextract.org) is a Web-based system designed to aid researchers in the analysis of distributed genomic data by providing a platform that facilitates the creation of bioinformatic workflows. Scientific workflows are created within the system by recording the analytic tasks performed by researchers. These tasks may include querying multiple data sources, saving query results as searchable data extracts, and executing local and Web-accessible analytic tools. The series of recorded tasks can be saved as a computational workflow simply by providing a name and description.
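The record-then-save idea described above can be illustrated with a minimal Python sketch. It is not the BioExtract Server's implementation or API: the `Step`, `Workflow`, and `Recorder` names, the step kinds, and all parameter values are hypothetical and chosen only to mirror the three kinds of tasks the abstract mentions.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class Step:
    """One recorded analytic task (hypothetical structure)."""
    kind: str                # e.g. "query", "save_extract", "run_tool"
    params: Dict[str, Any]   # arbitrary task parameters


@dataclass
class Workflow:
    """A named, described sequence of recorded steps."""
    name: str
    description: str
    steps: List[Step]


class Recorder:
    """Records tasks as they are performed, then saves them as a workflow."""

    def __init__(self) -> None:
        self._steps: List[Step] = []

    def record(self, kind: str, **params: Any) -> None:
        # Append the task in the order the researcher performed it.
        self._steps.append(Step(kind, dict(params)))

    def save_as_workflow(self, name: str, description: str) -> Workflow:
        # Copy the list so later recording does not alter the saved workflow.
        return Workflow(name, description, list(self._steps))


# Hypothetical session: query two sources, save an extract, run a tool.
recorder = Recorder()
recorder.record("query", sources=["source_a", "source_b"], terms="kinase")
recorder.record("save_extract", extract_name="kinase_hits")
recorder.record("run_tool", tool="alignment_tool", input="kinase_hits")

workflow = recorder.save_as_workflow("kinase-alignment", "Query, extract, align")
print([step.kind for step in workflow.steps])
```

In this sketch the only extra information needed to turn a recorded session into a workflow is a name and a description, which matches the behaviour the abstract describes.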
Keywords: Database integration, Genomic analysis, Scientific provenance, Scientific workflows, Web services
The BioExtract Server project is currently supported in part by the National Science Foundation grant DBI-0606909.