Skip to main content

A Declarative Pipeline Language for Complex Data Analysis

  • Conference paper
Logic-Based Program Synthesis and Transformation (LOPSTR 2012)

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 7844))

Abstract

We introduce BANpipe – a logic-based scripting language designed to model complex compositions of time consuming analyses. Its declarative semantics is described together with alternative operational semantics facilitating goal directed execution, parallel execution, change propagation and type checking. A portable implementation is provided, which supports expressing complex pipelines that may integrate different Prolog systems and provide automatic management of files.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 54.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 72.00
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Apache ant, http://ant.apache.org/ (accessed November 30, 2012)

  2. Lomsadze, A., Besemer, J., Borodovsky, M.: Genemarks: a self-training method for predicition of gene starts in microbial genomes. Implications for finding sequence motifs in regulatory regions. Nucleic Acids Research 29, 2607–2618 (2001)

    Article  Google Scholar 

  3. Chaiken, R., Jenkins, B., Larson, P.Å., Ramsey, B., Shakib, D., Weaver, S., Zhou, J.: Scope: easy and efficient parallel processing of massive data sets. Proceedings of the VLDB Endowment 1(2), 1265–1276 (2008)

    Google Scholar 

  4. Christiansen, H., Have, C.T., Lassen, O.T., Petit, M.: Bayesian Annotation Networks for Complex Sequence Analysis. In: Technical Communications of the 27th International Conference on Logic Programming, ICLP 2011. Leibniz International Proceedings in Informatics (LIPIcs), vol. 11, pp. 220–230. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik (2011)

    Google Scholar 

  5. Dean, J., Ghemawat, S.: Mapreduce: simplified data processing on large clusters. Communications of the ACM 51(1), 107–113 (2008)

    Article  Google Scholar 

  6. Durham, A.M., Kashiwabara, A.Y., Matsunaga, F.T.G., Ahagon, P.H., Rainone, F., Varuzza, L., Gruber, A.: Egene: a configurable pipeline generation system for automated sequence analysis. Bioinformatics 21(12), 2812–2813 (2005)

    Article  Google Scholar 

  7. Feldman, S.I.: Make – A program for maintaining computer programs. Software – Practice and Experience 9(3), 255–265 (1979)

    Article  MATH  Google Scholar 

  8. Hoon, S., Ratnapu, K.K., Chia, J.M., Kumarasamy, B., Juguang, X., Clamp, M., Stabenau, A., Potter, S., Clarke, L., Stupka, E.: Biopipe: A flexible framework for protocol-based bioinformatics analysis. Genome Research, 1904–1915 (2003)

    Google Scholar 

  9. Jørgensen, N.: Safeness of make-based incremental recompilation. In: Eriksson, L.-H., Lindsay, P.A. (eds.) FME 2002. LNCS, vol. 2391, pp. 126–145. Springer, Heidelberg (2002)

    Chapter  Google Scholar 

  10. Knight, S.: Building software with scons. Computing in Science & Engineering 7(1), 79–88 (2005)

    Article  Google Scholar 

  11. Lassen, O.T.: Compositionality in probabilistic logic modelling for biological sequence analysis. PhD thesis, Roskilde University (2011)

    Google Scholar 

  12. Moura, P.: Logtalk - Design of an Object-Oriented Logic Programming Language. PhD thesis, Department of Computer Science, University of Beira Interior, Portugal (September 2003)

    Google Scholar 

  13. Moura, P.: Programming patterns for logtalk parametric objects. In: Abreu, S., Seipel, D. (eds.) INAP 2009. LNCS (LNAI), vol. 6547, pp. 52–69. Springer, Heidelberg (2011)

    Google Scholar 

  14. Moura, P., Crocker, P., Nunes, P.: Multi-threading programming in Logtalk. In: Abreu, S., Costa, V.S. (eds.) Proceedings of the 7th Colloquium on Implementation of Constraint LOgic Programming Systems, pp. 87–101. University of Oporto, Oporto (2007)

    Google Scholar 

  15. Mungall, C.: Make-like build system based on prolog, https://github.com/cmungall/plmake (accessed November 30, 2012)

  16. Mungall, C.: Skam - skolem assisted makefiles, http://skam.sourceforge.net/ (accessed November 30, 2012)

  17. Noble, W.S.: A quick guide to organizing computational biology projects. PLoS Comput. Biol. 5(7), e1000424 (2009)

    Google Scholar 

  18. Olston, C., Reed, B., Srivastava, U., Kumar, R., Tomkins, A.: Pig latin: a not-so-foreign language for data processing. In: Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data, pp. 1099–1110. ACM (2008)

    Google Scholar 

  19. Sato, T., Kameya, Y.: Prism: a language for symbolic-statistical modeling. In: International Joint Conference on Artificial Intelligence, vol. 15, pp. 1330–1339 (1997)

    Google Scholar 

  20. Shapiro, E.: The family of concurrent logic programming languages. ACM Computing Surveys 21(3), 412 (1989)

    Article  Google Scholar 

  21. Stewart, A.C., Osborne, B., Read, T.D.: Diya: a bacterial annotation pipeline for any genomics lab. Bioinformatics 25, 962–963 (2009)

    Article  Google Scholar 

  22. Thusoo, A., Sarma, J.S., Jain, N., Shao, Z., Chakka, P., Anthony, S., Liu, H., Wyckoff, P., Murthy, R.: Hive: a warehousing solution over a map-reduce framework. Proceedings of the VLDB Endowment 2(2), 1626–1629 (2009)

    Google Scholar 

  23. Ueda, K.: Guarded horn clauses. Technical Report TR-103, ICOT, Tokyo (1985)

    Google Scholar 

  24. Weirich, J.: Rake – ruby make, http://rake.rubyforge.org/ (accessed November 30, 2012)

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Christiansen, H., Theil Have, C., Torp Lassen, O., Petit, M. (2013). A Declarative Pipeline Language for Complex Data Analysis. In: Albert, E. (eds) Logic-Based Program Synthesis and Transformation. LOPSTR 2012. Lecture Notes in Computer Science, vol 7844. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38197-3_3

Download citation

  • DOI: https://doi.org/10.1007/978-3-642-38197-3_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-38196-6

  • Online ISBN: 978-3-642-38197-3

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics