Abstract
By providing genome-scale information on gene expression, microarray technology has gained popularity in diverse areas, including clinical medicine. However, the analysis and interpretation of microarray data are often complicated. This chapter describes various strategies for microarray data analysis. The analysis starts with the scanned image of a microarray. The image information is processed and summarized into numerical values that represent the abundance of transcripts. Technical variability and systematic biases can be minimized with appropriate background correction and normalization. A considerable number of genes are either not expressed or not detected by microarray technology; these genes can be filtered out before statistical comparison to reduce the dimensionality of the problem. The next steps in the analysis involve statistical comparison, cluster analysis, and visualization. Genes from the same cluster are considered to be coexpressed and/or coregulated. Coexpressed genes can also be grouped into categories by their biological function and cellular location. By combining prior knowledge with statistical results, we can draw inferences from the gene expression profiles.
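The order of the steps named in the abstract (transformation, normalization, filtering, statistical comparison) can be sketched on toy data. The following is a minimal illustration, not the chapter's own procedure: the intensities, gene labels, and the filtering threshold are hypothetical, median centering stands in for more elaborate normalization methods, and a Welch t-statistic stands in for the statistical comparison.

```python
import math
from statistics import mean, median, variance


def median_center(arrays):
    """Shift each array so that all arrays share the same median log2 signal."""
    medians = [median(arr) for arr in arrays]
    grand = median(medians)  # common target level for every array
    return [[v - m + grand for v in arr] for arr, m in zip(arrays, medians)]


def filter_expressed(genes, arrays, min_log2):
    """Keep genes whose log2 signal reaches min_log2 on at least one array."""
    keep = [i for i in range(len(genes))
            if max(arr[i] for arr in arrays) >= min_log2]
    return [genes[i] for i in keep], [[arr[i] for i in keep] for arr in arrays]


def welch_t(treated, control):
    """Welch t-statistic (treated minus control); 0.0 if there is no spread."""
    se = math.sqrt(variance(treated) / len(treated)
                   + variance(control) / len(control))
    return (mean(treated) - mean(control)) / se if se > 0 else 0.0


# Hypothetical background-corrected intensities:
# one list per array, one value per gene.
genes = ["g1", "g2", "g3", "g4"]
raw = [
    [120.0, 900.0, 30.0, 2000.0],   # control 1
    [150.0, 1100.0, 25.0, 2400.0],  # control 2
    [480.0, 950.0, 28.0, 600.0],    # treated 1
    [520.0, 1000.0, 35.0, 640.0],   # treated 2
]

# 1. Log-transform, 2. normalize, 3. filter weak signals, 4. compare groups.
log2_arrays = [[math.log2(v) for v in arr] for arr in raw]
norm = median_center(log2_arrays)
kept_genes, kept = filter_expressed(genes, norm, min_log2=6.0)  # assumed cutoff
t_stats = {
    g: welch_t([kept[2][i], kept[3][i]], [kept[0][i], kept[1][i]])
    for i, g in enumerate(kept_genes)
}

for gene, t in t_stats.items():
    print(f"{gene}: t = {t:.2f}")
```

With only two replicates per group, as in this toy example, the t-statistics are unstable; a real analysis would need more replicates and a correction for testing many genes at once.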
Copyright information
© 2007 Humana Press Inc.
About this protocol
Cite this protocol
Kong, S.W. (2007). Statistical Methods in Cardiac Gene Expression Profiling. In: Zhang, J., Rokosh, G. (eds) Cardiac Gene Expression. Methods in Molecular Biology, vol 366. Humana Press. https://doi.org/10.1007/978-1-59745-030-0_5
DOI: https://doi.org/10.1007/978-1-59745-030-0_5
Publisher Name: Humana Press
Print ISBN: 978-1-58829-352-7
Online ISBN: 978-1-59745-030-0
eBook Packages: Springer Protocols