Annotation of Gene Products in the Literature with Gene Ontology Terms Using Syntactic Dependencies

  • Jung-jae Kim
  • Jong C. Park
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3248)

Abstract

We present a method for automatically annotating gene products in the literature with terms from the Gene Ontology (GO), which provides a dynamic but controlled vocabulary. Although GO is well organized, with lexical relations such as synonymy, ‘is-a’, and ‘part-of’ among its terms, GO terms show a high degree of morphological and syntactic variation in the literature. Unlike previous approaches, which considered only restricted kinds of term variation, our method also uncovers the syntactic dependencies between gene product names and ontological terms in order to handle real-world syntactic variation. It is based on the observation that the component words of an ontological term usually appear in a sentence with established patterns of syntactic dependencies.
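
As a rough illustration of the idea in the abstract, and not the authors' actual algorithm, the sketch below checks whether every component word of a GO term is linked to a gene product name by a short path of dependency edges. The sentence, the hand-written dependency edges, the gene name "GeneX", and the names `connected_within`, `annotate`, and `max_hops` are all assumptions made for this example.

```python
from collections import deque

# Hypothetical sentence: "GeneX regulates the repair of damaged DNA".
# Hand-written (head, dependent) pairs standing in for a parser's output.
EDGES = [
    ("regulates", "genex"),
    ("regulates", "repair"),
    ("repair", "the"),
    ("repair", "dna"),
    ("dna", "damaged"),
]

def connected_within(edges, start, targets, max_hops):
    """True if every target is reachable from start in at most max_hops
    dependency edges, ignoring edge direction."""
    graph = {}
    for head, dep in edges:
        graph.setdefault(head, set()).add(dep)
        graph.setdefault(dep, set()).add(head)
    dist = {start: 0}
    queue = deque([start])
    while queue:  # breadth-first search bounded by max_hops
        node = queue.popleft()
        if dist[node] == max_hops:
            continue
        for nxt in graph.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return all(t in dist for t in targets)

def annotate(edges, gene, go_term, max_hops=3):
    """Propose go_term for gene if all of the term's words are
    syntactically close to the gene name (illustrative heuristic only)."""
    words = go_term.lower().split()
    return connected_within(edges, gene.lower(), words, max_hops)

print(annotate(EDGES, "GeneX", "DNA repair"))
print(annotate(EDGES, "GeneX", "protein binding"))
```

Because the check follows dependency links and ignores word order, the variant "repair of damaged DNA" can still be matched to the term "DNA repair", which is the kind of syntactic variation the abstract refers to.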

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Jung-jae Kim (1)
  • Jong C. Park (1)
  1. Korea Advanced Institute of Science and Technology, Daejeon, South Korea
