Ontology-Driven Approaches to Analyzing Data in Functional Genomics

  • Francisco Azuaje
  • Fatima Al-Shahrour
  • Joaquin Dopazo
Part of the Methods in Molecular Biology book series (MIMB, volume 316)


Ontologies are fundamental knowledge representations that provide not only standards for annotating and indexing biological information, but also the basis for implementing functional classification and interpretation models. This chapter discusses the application of the Gene Ontology (GO) to predictive tasks in functional genomics, focusing on the problem of analyzing functional patterns associated with gene products. The chapter is divided into two main parts. The first part gives an overview of GO and its applications in the development of functional classification models. The second part presents two approaches to characterizing genomic information using GO: methods for measuring the functional similarity of gene products, and a tool for supporting the analysis and validation of gene expression clustering.

Key Words

Gene ontology; clustering; expression data; similarity; functional genomics



We thank Olivier Bodenreider for helpful advice on ontologies. This work was supported by grant BIO2001-0068 from the Ministerio de Ciencia y Tecnología. F.A. was partly supported by a visiting fellowship from the US National Library of Medicine.



Copyright information

© Humana Press Inc. 2006

Authors and Affiliations

  • Francisco Azuaje (1)
  • Fatima Al-Shahrour (2)
  • Joaquin Dopazo (2)
  1. Computer Science Research Institute, University of Ulster, Northern Ireland, UK
  2. Department of Bioinformatics, Centro de Investigacion Principe Felipe, Valencia, Spain
