Ontology Guided Data Integration for Computational Prioritization of Disease Genes

  • Bert Coessens
  • Stijn Christiaens
  • Ruben Verlinden
  • Yves Moreau
  • Robert Meersman
  • Bart De Moor
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4277)


We present our progress on a framework for collecting and presenting biomedical information through ontology-based mediation. The framework is built on top of Endeavour, a methodology for computational prioritization of candidate disease genes. Endeavour prioritizes genes by their similarity to a set of training genes, drawing on a wide variety of information sources. Collecting information from different sources is difficult, however, and can lead to inflexible solutions. We describe an ontology-based mediation framework for efficient retrieval, integration, and visualization of the information sources Endeavour uses. The framework makes it possible to (1) integrate the information sources on a conceptual level, (2) provide transparency to the user, (3) eliminate ambiguity, and (4) increase efficiency in information display.
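As a rough illustration of prioritizing candidates by similarity to training genes across several sources, the following minimal sketch may help. It is not Endeavour's actual scoring or fusion procedure: the Jaccard similarity, the mean-rank fusion, the function names, and all gene and annotation identifiers are assumptions chosen for brevity.

```python
from statistics import mean


def jaccard(a, b):
    """Jaccard similarity of two annotation sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_by_source(candidates, training_profile):
    """Rank candidates (1 = best) by similarity to the training profile."""
    scores = {g: jaccard(ann, training_profile) for g, ann in candidates.items()}
    # Ties are broken alphabetically so the ranking is deterministic.
    ordered = sorted(scores, key=lambda g: (-scores[g], g))
    return {g: i + 1 for i, g in enumerate(ordered)}


def prioritize(sources, training_genes, candidate_genes):
    """Fuse per-source rankings by mean rank (a stand-in for a real fusion method)."""
    per_source_ranks = []
    for annotations in sources.values():
        # The training profile is the union of the training genes' annotations.
        profile = set()
        for g in training_genes:
            profile |= annotations.get(g, set())
        candidates = {g: annotations.get(g, set()) for g in candidate_genes}
        per_source_ranks.append(rank_by_source(candidates, profile))
    fused = {g: mean(r[g] for r in per_source_ranks) for g in candidate_genes}
    return sorted(candidate_genes, key=lambda g: (fused[g], g))


# Hypothetical toy data: three sources, two training genes, three candidates.
sources = {
    "go": {"T1": {"a", "b"}, "T2": {"b", "c"},
           "C1": {"a", "b", "c"}, "C2": {"x"}, "C3": {"a"}},
    "pathway": {"T1": {"p1"}, "T2": {"p1", "p2"},
                "C1": {"p1"}, "C2": {"p9"}, "C3": {"p1", "p2"}},
    "domain": {"T1": {"d1"}, "T2": {"d2"},
               "C1": {"d1", "d2"}, "C2": {"d7"}, "C3": {"d1", "d3"}},
}
ranking = prioritize(sources, ["T1", "T2"], ["C1", "C2", "C3"])
print(ranking)
```

In this toy setup the candidates are ranked once per source and the ranks are then averaged, so a gene that scores well in most sources rises to the top even if one source disagrees.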


Gene Ontology, Ontological Commitment, Training Gene, Gene Prioritization, Prioritization Method
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Bert Coessens (1)
  • Stijn Christiaens (2)
  • Ruben Verlinden (2)
  • Yves Moreau (1)
  • Robert Meersman (2)
  • Bart De Moor (1)

  1. Department of Electrical Engineering, Katholieke Universiteit Leuven
  2. Semantics Technology and Applications Research Laboratory, Vrije Universiteit Brussel
