Plant Reverse Genetics pp 27-43
Microarray Data Analysis
Abstract
Gene expression profiling has revolutionized functional genomics research by providing a quick handle on all the transcriptional changes that occur in the cell in response to internal or external perturbations or developmental programs. Microarrays have become the most popular technology for recording gene expression profiles. This chapter describes the steps needed to analyze Affymetrix microarray data using the open-source statistical tools R and Bioconductor. The reader is walked through the basic steps of data analysis: reading raw data, assessing quality, preprocessing and normalization, discovering differentially expressed genes, comparing gene lists, performing functional enrichment analysis, and saving results to files for future reference. Some familiarity with computers is assumed. The chapter is self-contained, with installation instructions for R and Bioconductor packages along with links to downloadable data and code for reproducing the examples.
Key words
Gene expression; Statistical analysis; Bioinformatics; Differential expression; Gene Ontology