Abstract
Living organisms need proteins to provide structure, such as skin and bone, and to provide function to the organism through, for example, hormones and enzymes. Genes are translated to proteins after first being transcribed to messenger RNA. Even though every cell of an organism contains the full set of genes for that organism, only a small set of the genes is functional in each cell. The levels at which the different genes are functional in various cell types (their expression levels) can all be screened simultaneously using microarrays. The design of two-channel microarray experiments is discussed and ideas are illustrated through the analysis of data from a designed microarray experiment on gene expression using liver and muscle tissue. The number of genes screened in a microarray experiment can be in the thousands or tens of thousands. So it is important to adjust for the multiplicity of comparisons of gene expression levels because, otherwise, the more genes that are screened, the more likely incorrect statistical inferences are to occur. Different purposes of gene expression experiments may call for different control of multiple comparison error rates. We illustrate how control of the statistical error rate translates into control of the rate of incorrect biological decisions. We discuss the pros and cons of two forms of multiple comparisons inference: testing for significant difference and providing confidence bounds. Two multiple testing principles are described: closed testing and partitioning. Stepdown testing, a popular form of gene expression analysis, is shown to be a shortcut to closed and partitioning testing. We give a set of conditions for such a shortcut to be valid.
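To make the stepdown idea concrete, the following is a minimal sketch of one well-known stepdown procedure, Holm's method, which can be viewed as a shortcut to closed testing when each intersection hypothesis is tested by a Bonferroni test. It is offered only as an illustration and is not necessarily the procedure developed in the chapter. The function name `holm_stepdown` and the example p-values are hypothetical.

```python
from typing import List


def holm_stepdown(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """Return one reject (True) / retain (False) decision per hypothesis.

    Illustrative Holm stepdown rule; assumes one p-value per gene.
    """
    m = len(p_values)
    # Visit hypotheses from the smallest p-value to the largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for step, idx in enumerate(order):
        # At step k (0-based), m - k hypotheses remain untested,
        # so the threshold is alpha / (m - k).
        if p_values[idx] <= alpha / (m - step):
            reject[idx] = True
        else:
            # Stop at the first non-rejection; all remaining
            # hypotheses are retained.
            break
    return reject


# Hypothetical p-values for four genes.
p = [0.001, 0.04, 0.03, 0.2]
decisions = holm_stepdown(p)
print(decisions)
```

With these hypothetical values, three of the four unadjusted p-values fall below 0.05, but only the first gene is declared differentially expressed once the multiplicity of comparisons is taken into account.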
Copyright information
© 2006 Springer Science+Business Media, Inc.
About this chapter
Cite this chapter
Hsu, J.C., Chang, J.Y., Wang, T. (2006). Screening for Differential Gene Expressions from Microarray Data. In: Dean, A., Lewis, S. (eds) Screening. Springer, New York, NY. https://doi.org/10.1007/0-387-28014-6_6
DOI: https://doi.org/10.1007/0-387-28014-6_6
Publisher Name: Springer, New York, NY
Print ISBN: 978-0-387-28013-4
Online ISBN: 978-0-387-28014-1
eBook Packages: Mathematics and Statistics (R0)