PROVA: Rule-Based Java-Scripting for a Bioinformatics Semantic Web

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 2994)

Abstract

Transparent information integration across distributed and heterogeneous data sources and computational tools is a prime concern for bioinformatics. Recently, there have been proposals for a semantic web that addresses these requirements. A promising approach for such a semantic web is to integrate rules, which specify and implement workflows, with object-orientation, which caters for computational aspects.

We present PROVA, a Java-based rule engine, which realises this integration. It lets one separate a declarative description of information workflows from implementation details, which makes code easier to create and maintain. We show how PROVA is used to compose an information and computation workflow involving

  • rules for specifying the workflow,

  • rules for reasoning over the data,

  • rules for accessing flat files, databases, and other services, and

  • rules involving heavy-duty computations.

The resulting code is very compact and re-usable.
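
The separation of a declarative workflow from its implementation can be illustrated with a minimal sketch. It is written in Python, not in PROVA's own rule syntax (which this abstract does not show), and every name in it (`Rule`, `run_workflow`, the fact keys, the three helper functions) is hypothetical. It only shows the general idea: rules state which facts they need and which fact they derive, while data access, reasoning, and computation live in ordinary functions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Facts = Dict[str, object]


@dataclass(frozen=True)
class Rule:
    """A declarative step: the facts it needs and the fact it derives."""
    needs: Tuple[str, ...]
    derives: str
    action: Callable[..., object]  # implementation detail, kept out of the workflow


def run_workflow(rules: List[Rule], facts: Facts) -> Facts:
    """Naive forward chaining: fire any rule whose inputs are known, until nothing changes."""
    facts = dict(facts)
    changed = True
    while changed:
        changed = False
        for rule in rules:
            if rule.derives not in facts and all(n in facts for n in rule.needs):
                facts[rule.derives] = rule.action(*(facts[n] for n in rule.needs))
                changed = True
    return facts


# Hypothetical stand-ins for the kinds of rules listed above.
def load_records(lines):          # data access (here: an in-memory "flat file")
    return [line.split() for line in lines]


def keep_multi_domain(records):   # reasoning over the data
    return [r for r in records if len(r) > 2]   # entry id plus at least two domains


def count_domain_pairs(records):  # computation
    return sum((len(r) - 1) * (len(r) - 2) // 2 for r in records)


# The workflow itself is just data; the order of the rules does not matter.
RULES = [
    Rule(("multi_domain",), "pair_count", count_domain_pairs),
    Rule(("records",), "multi_domain", keep_multi_domain),
    Rule(("source",), "records", load_records),
]

result = run_workflow(RULES, {"source": ["1abc d1 d2", "2xyz d1", "3pqr d1 d2 d3"]})
print(result["pair_count"])
```

Because the rule list carries no control flow, a step can be swapped (for example, replacing `load_records` with a database query) without touching the other rules.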

We give a detailed account of PROVA and document its use with an example system, PSIMAP, which derives domain-domain interactions from multi-domain structures in the PDB using the SCOP domain and superfamily definitions. PSIMAP is a typical bioinformatics application in that it integrates disparate information resources in different formats, namely flat files (PDB) and a database (SCOP), and requires additional computations.
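
To make the PSIMAP computation concrete, the following is a rough sketch of how domain-domain interactions might be derived from coordinates and lifted to the superfamily level. It is an assumption-laden illustration, not the authors' implementation: the abstract does not state the interaction criterion, so the distance cutoff `max_dist` and the contact threshold `min_contacts` are illustrative parameters, and the function name and input layout are hypothetical.

```python
from itertools import combinations
from math import dist
from typing import Dict, List, Set, Tuple

Point = Tuple[float, float, float]


def interacting_superfamilies(
    domains: Dict[str, List[Point]],   # domain id -> coordinates (e.g. parsed from a PDB file)
    superfamily: Dict[str, str],       # domain id -> superfamily id (e.g. looked up in SCOP)
    max_dist: float = 5.0,             # assumed cutoff, not taken from the abstract
    min_contacts: int = 5,             # assumed threshold, not taken from the abstract
) -> Set[Tuple[str, str]]:
    """Return superfamily pairs whose domains have enough close coordinate pairs."""
    pairs: Set[Tuple[str, str]] = set()
    for (a, pts_a), (b, pts_b) in combinations(sorted(domains.items()), 2):
        # Brute-force contact count between the two domains.
        contacts = sum(1 for p in pts_a for q in pts_b if dist(p, q) <= max_dist)
        if contacts >= min_contacts:
            sf_a, sf_b = sorted((superfamily[a], superfamily[b]))
            pairs.add((sf_a, sf_b))
    return pairs


# Toy structure with three domains; only d1 and d2 are close to each other.
toy_domains = {
    "d1": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    "d2": [(2.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
    "d3": [(50.0, 0.0, 0.0)],
}
toy_superfamily = {"d1": "sfA", "d2": "sfB", "d3": "sfC"}

print(interacting_superfamilies(toy_domains, toy_superfamily, min_contacts=2))
```

The all-pairs distance loop is the simplest possible choice and scales with the product of the domain sizes; it stands in for the "additional computations" mentioned above rather than for any optimised method.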

PROVA is available at comas.soi.city.ac.uk/prova

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kozlenkov, A., Schroeder, M. (2004). PROVA: Rule-Based Java-Scripting for a Bioinformatics Semantic Web. In: Rahm, E. (ed.) Data Integration in the Life Sciences. DILS 2004. Lecture Notes in Computer Science, vol 2994. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24745-6_2

  • DOI: https://doi.org/10.1007/978-3-540-24745-6_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-21300-0

  • Online ISBN: 978-3-540-24745-6

  • eBook Packages: Springer Book Archive
