Bioinformatics, pp 245-255

Expression and Microarrays

  • Joaquín Dopazo
  • Fátima Al-Shahrour
Part of the Methods in Molecular Biology™ book series (MIMB, volume 453)


High-throughput methodologies have increased the amount of available experimental microarray data by several orders of magnitude. Nevertheless, translating these data into useful biological knowledge remains a challenge. Without a proper biological interpretation, there is a risk of perceiving these methodologies as mere factories that produce never-ending quantities of data.

Methods of interpreting these data are continuously evolving. Typically, a simple two-step approach has been used: genes of interest are first selected based on thresholds for the experimental values, and the annotations of these genes are then analyzed for enrichment in biologically relevant terms. For various reasons, such methods perform quite poorly, and new procedures inspired by systems biology, which directly address sets of functionally related genes, are currently under development.
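As a rough illustration of the two-step approach described above, the following minimal sketch selects genes by a threshold and then tests one annotation term for over-representation with a one-sided hypergeometric test. It is not taken from the chapter: the function names `select_genes` and `enrichment_p_value`, the gene identifiers, the scores, and the threshold are all hypothetical, and the choice of the hypergeometric test is an assumption made for the example.

```python
from math import comb

def select_genes(scores, threshold):
    """Step 1: keep genes whose absolute score reaches the threshold."""
    return {gene for gene, value in scores.items() if abs(value) >= threshold}

def enrichment_p_value(selected, annotated, universe):
    """Step 2: one-sided hypergeometric P(X >= k) for one annotation term."""
    N = len(universe)                         # all genes measured
    K = len(annotated & universe)             # genes carrying the term
    n = len(selected & universe)              # genes selected in step 1
    k = len(selected & annotated & universe)  # selected genes carrying the term
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return tail / comb(N, n)

# Hypothetical toy data: ten genes with made-up scores (e.g. log-ratios).
scores = {
    "g1": 3.1, "g2": -2.6, "g3": 2.2, "g4": 2.0, "g5": 0.4,
    "g6": -1.1, "g7": 0.9, "g8": 1.5, "g9": -0.3, "g10": 0.1,
}
universe = set(scores)
term_genes = {"g1", "g2", "g3", "g9"}  # hypothetical annotation term

selected = select_genes(scores, threshold=2.0)
p_value = enrichment_p_value(selected, term_genes, universe)
print(sorted(selected), p_value)
```

With these toy values, four genes pass the threshold and three of them carry the term, so the p-value is 25/210, about 0.119. In a real analysis many terms are tested at once, so the resulting p-values would need a multiple-testing correction.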

Key words

Functional interpretation, functional genomics, multiple testing, gene ontology



This work is supported by grants from MEC BIO2005-01078, NRC Canada-SEPOCT Spain, and Fundación Genoma España.



Copyright information

© Humana Press, a part of Springer Science+Business Media, LLC 2008

Authors and Affiliations

  • Joaquín Dopazo (1)
  • Fátima Al-Shahrour (1)
  1. Department of Bioinformatics, Centro de Investigación Príncipe Felipe (CIPF), Valencia, Spain