Identifying Gene Ontology Areas for Automated Enrichment

  • Catia Pesquita
  • Tiago Grego
  • Francisco Couto
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5518)


Biomedical ontologies provide a commonly accepted scheme for the characterization of biological concepts that enables knowledge sharing and integration. Updating and maintaining an ontology requires highly specialized experts and is very time-consuming, given the amount of literature that has to be analyzed and the difficulty of reaching consensus.

This paper outlines a proposal for the development of automated processes for the enrichment of the Gene Ontology (GO) that will use text mining and ontology alignment techniques to extract new terms and relations. We also identify the areas of GO whose level of detail is too low to meet the needs of the community at large. We have found that GO’s content is well suited to manual annotation, which reflects the coordination between GO developers and GO annotators, but that there are 17 areas that would benefit from enrichment to support electronic annotation efforts.

With this work we hope to provide biomedical researchers with an extended version of GO that can be used ‘as is’, or that GO developers can use as a starting point to enrich GO.


Keywords: biomedical ontologies, ontology enrichment, text mining, ontology alignment




Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Catia Pesquita (1)
  • Tiago Grego (1)
  • Francisco Couto (1)
  1. LaSIGE, Universidade de Lisboa, Lisboa, Portugal
