Ontologies and Text Mining as a Basis for a Semantic Web for the Life Sciences

  • Andreas Doms
  • Vaida Jakonienė
  • Patrick Lambrix
  • Michael Schroeder
  • Thomas Wächter
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4126)


The life sciences are a promising application area for semantic web technologies, as they offer large online repositories of structured and unstructured data together with ontologies that structure this knowledge. We give a brief overview of biomedical ontologies and show how they can help to locate, retrieve, and integrate biomedical data. Annotating literature with ontology terms is an important step in supporting such ontology-based searches. We review the steps involved in this text mining task and introduce the ontology-based search engine GoPubMed. As the underlying data sources evolve, so do the ontologies. We give a brief overview of different approaches that support the semi-automatic evolution of ontologies.
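
To make the annotation task concrete, the following is a minimal sketch of matching ontology term labels against a piece of text. It is not the method used by GoPubMed, which this summary does not describe; it assumes exact, case-insensitive label matching, and the ontology, its identifiers (`EX:...`), and the function name `annotate` are hypothetical.

```python
import re

# Hypothetical toy ontology: identifiers and labels are illustrative only
# and are not taken from any real ontology.
TOY_ONTOLOGY = {
    "EX:0001": "cell death",
    "EX:0002": "signal transduction",
    "EX:0003": "gene expression",
}


def annotate(text, ontology):
    """Return (term_id, label, start, end) tuples for every exact,
    case-insensitive occurrence of a term label, ordered by position."""
    hits = []
    for term_id, label in ontology.items():
        # Word boundaries prevent matches inside longer words.
        pattern = r"\b" + re.escape(label) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            hits.append((term_id, label, match.start(), match.end()))
    return sorted(hits, key=lambda hit: hit[2])


abstract = "Signal transduction pathways that regulate cell death."
annotations = annotate(abstract, TOY_ONTOLOGY)
for term_id, label, start, end in annotations:
    print(term_id, label, abstract[start:end])
```

Exact matching of this kind misses synonyms, word-order variation, and inflected forms, which is one reason the annotation task involves more than a simple dictionary lookup.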


Keywords (machine-generated, not supplied by the authors): Gene Ontology, Ontology Term, Mapping Rule, Ontology Evolution, Open Biomedical Ontology





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Andreas Doms (1)
  • Vaida Jakonienė (2)
  • Patrick Lambrix (2)
  • Michael Schroeder (1)
  • Thomas Wächter (1)
  1. Biotechnological Centre, Technische Universität Dresden, Germany
  2. Department of Computer and Information Science, Linköpings universitet, Sweden
