Modeling Genome Data Processing Pipelines

  • Chapter
High-Performance In-Memory Genome Data Analysis

Part of the book series: In-Memory Data Management Research ((IMDM))

Abstract

Analyzing genome data requires a series of calculation steps that must be performed in a specific order; together, these steps constitute a genome data processing pipeline. Research into faster and more reliable analysis methods is still ongoing, so individual steps or the entire sequence of a pipeline may change. A modular and flexible way to configure pipelines could simplify their use and make it easier for researchers to share them. If pipelines could be configured without altering source code, bioinformaticians and technicians would no longer need to rewrite a pipeline every time a single algorithm changes. This contribution proposes using common process modeling tools for the abstract representation of genome data processing pipelines. The benefits and drawbacks of different process model notations are examined, with a special focus on the options they offer for specifying execution semantics. As a prototype, a system that parses and executes genome data processing pipelines specified in Business Process Model and Notation (BPMN) is introduced.
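To make the idea of parsing and executing a BPMN-specified pipeline more concrete, the following is a minimal sketch in Python. It is not the prototype described in the chapter. The pipeline document, the step names (`alignment`, `variant_calling`), the `STEPS` registry, and the functions `parse_pipeline` and `run_pipeline` are all hypothetical and chosen for illustration only. The sketch assumes a strictly linear process with no gateways, and it uses only the standard BPMN 2.0 model namespace and the `task`, `startEvent`, and `sequenceFlow` elements.

```python
import xml.etree.ElementTree as ET

# Standard BPMN 2.0 model namespace.
BPMN_NS = {"bpmn": "http://www.omg.org/spec/BPMN/20100524/MODEL"}

# Hypothetical two-step pipeline, for illustration only.
PIPELINE_XML = """<definitions
    xmlns="http://www.omg.org/spec/BPMN/20100524/MODEL"
    id="defs" targetNamespace="http://example.org/pipeline">
  <process id="pipeline">
    <startEvent id="start"/>
    <task id="t1" name="alignment"/>
    <task id="t2" name="variant_calling"/>
    <endEvent id="end"/>
    <sequenceFlow id="f1" sourceRef="start" targetRef="t1"/>
    <sequenceFlow id="f2" sourceRef="t1" targetRef="t2"/>
    <sequenceFlow id="f3" sourceRef="t2" targetRef="end"/>
  </process>
</definitions>"""


def parse_pipeline(xml_text):
    """Return task names in execution order (assumes a linear flow)."""
    root = ET.fromstring(xml_text)
    process = root.find("bpmn:process", BPMN_NS)
    # Map task ids to their names.
    names = {t.get("id"): t.get("name")
             for t in process.findall("bpmn:task", BPMN_NS)}
    # Map each node id to its single successor.
    successor = {f.get("sourceRef"): f.get("targetRef")
                 for f in process.findall("bpmn:sequenceFlow", BPMN_NS)}
    start = process.find("bpmn:startEvent", BPMN_NS).get("id")

    order, seen = [], set()
    node = successor.get(start)
    # Follow sequence flows until no successor remains; stop on a cycle.
    while node is not None and node not in seen:
        seen.add(node)
        if node in names:
            order.append(names[node])
        node = successor.get(node)
    return order


def run_pipeline(order, steps, data):
    """Apply the registered function for each step name in order."""
    for name in order:
        data = steps[name](data)
    return data


# Hypothetical step implementations; real steps would invoke analysis tools.
STEPS = {
    "alignment": lambda log: log + ["aligned"],
    "variant_calling": lambda log: log + ["variants called"],
}

if __name__ == "__main__":
    order = parse_pipeline(PIPELINE_XML)
    print(order)
    print(run_pipeline(order, STEPS, ["raw reads"]))
```

In this sketch, replacing an algorithm means changing an entry in the step registry or a task name in the model, while the execution code stays untouched. Branching or parallel pipelines would need gateway handling, which is omitted here.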

Author information

Correspondence to Marie Schäffer.

Copyright information

© 2014 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Schäffer, M. (2014). Modeling Genome Data Processing Pipelines. In: Plattner, H., Schapranow, MP. (eds) High-Performance In-Memory Genome Data Analysis. In-Memory Data Management Research. Springer, Cham. https://doi.org/10.1007/978-3-319-03035-7_2
