Ontology-Driven Approaches to Analyzing Data in Functional Genomics
Ontologies are fundamental knowledge representations that provide not only standards for annotating and indexing biological information, but also the basis for implementing functional classification and interpretation models. This chapter discusses the application of the Gene Ontology (GO) to predictive tasks in functional genomics, focusing on the problem of analyzing functional patterns associated with gene products. The chapter is divided into two main parts. The first part gives an overview of GO and its applications in the development of functional classification models. The second part presents two approaches to characterizing genomic information using GO: methods for measuring the functional similarity of gene products, and a tool for supporting gene expression clustering analysis and validation.
Key Words: Gene ontology; clustering; expression data; similarity; functional genomics
We thank Oliver Bodenreider for helpful advice on ontologies. This work was supported by grant BIO2001-0068 from the Ministerio de Ciencia y Tecnología. F.A. was partly supported by a visiting fellowship from the US National Library of Medicine.