Abstract
A series of large-scale Arabidopsis thaliana microarray experiments profiling genome-wide expression across developmental stages, cell types, and environmental conditions has produced a tremendous amount of gene expression data. This gene expression is the output of complex transcriptional regulatory networks and provides a starting point for identifying the dominant transcriptional regulatory modules acting within the plant. Highly co-expressed groups of genes are likely to be regulated by similar transcription factors, so finding these groups can reduce complex expression data to a set of dominant transcriptional regulatory modules. Determining the biological significance of these patterns is an informatics challenge that has required the development of new methods. Using these methods, we can begin to understand the biological information contained within large-scale expression data sets.
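The idea of collapsing co-expressed genes into a few dominant patterns can be illustrated with a minimal sketch. The abstract does not say which clustering method the chapter uses, so this example swaps in a simple stand-in: genes are linked when the Pearson correlation of their profiles meets a threshold, linked genes form a group, and each group is summarized by its mean profile. The gene names, expression values, the 0.9 threshold, and the function names are all hypothetical and chosen only for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / (sd_x * sd_y)


def coexpression_groups(profiles, threshold=0.9):
    """Group genes connected by correlation >= threshold (assumed cutoff).

    Uses a small union-find, so groups are the connected components of
    the thresholded correlation graph.
    """
    genes = list(profiles)
    parent = {g: g for g in genes}

    def find(g):
        while parent[g] != g:
            parent[g] = parent[parent[g]]  # path compression
            g = parent[g]
        return g

    for i, a in enumerate(genes):
        for b in genes[i + 1:]:
            if pearson(profiles[a], profiles[b]) >= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for g in genes:
        groups.setdefault(find(g), []).append(g)
    return sorted(groups.values())


def mean_profile(profiles, group):
    """Average the profiles of a group into one representative pattern."""
    n = len(group)
    return [sum(vals) / n for vals in zip(*(profiles[g] for g in group))]


# Hypothetical toy data: four genes measured across four conditions.
toy = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.1, 3.9, 6.2, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [8.0, 6.1, 3.9, 2.0],
}

groups = coexpression_groups(toy)
patterns = [mean_profile(toy, g) for g in groups]
```

Here four gene profiles reduce to two representative patterns, one rising and one falling across the conditions. Real data sets would need normalization and a principled choice of method and group number, which this sketch does not attempt.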
Copyright information
© 2009 Humana Press, a part of Springer Science+Business Media, LLC
About this protocol
Cite this protocol
Orlando, D.A., Brady, S.M., Koch, J.D., Dinneny, J.R., Benfey, P.N. (2009). Manipulating Large-Scale Arabidopsis Microarray Expression Data: Identifying Dominant Expression Patterns and Biological Process Enrichment. In: Belostotsky, D. (ed.) Plant Systems Biology. Methods in Molecular Biology™, vol 553. Humana Press. https://doi.org/10.1007/978-1-60327-563-7_4
DOI: https://doi.org/10.1007/978-1-60327-563-7_4
Published:
Publisher Name: Humana Press
Print ISBN: 978-1-60327-562-0
Online ISBN: 978-1-60327-563-7
eBook Packages: Springer Protocols