Abstract
Microarrays belong to a new class of biotechnologies that allow the expression levels of thousands of genes to be monitored simultaneously. In microarray data analysis, comparing gene expression profiles across different conditions and selecting biologically interesting genes are crucial tasks. Multivariate statistical methods have been applied to analyze these large data sets. In this chapter we describe a new nonparametric statistical approach for identifying genes whose expression is altered between two experimental conditions. Specifically, we propose estimating the distributions of a t-type statistic and of its null statistic using kernel methods. Comparing these two distributions by means of a likelihood ratio test identifies genes with significantly changed expression. We also derive a method for calculating the cut-off point and the acceptance region. The methodology is applied to a leukemia data set containing expression levels of 7129 genes, and the results are compared with those of the traditional t-test and the normal mixture model.
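The idea outlined in the abstract can be illustrated with a minimal sketch. It is not the chapter's implementation, and several choices are assumptions made only for illustration: the t-type score is a Welch-type statistic, the null score is obtained by balanced sign flips within each condition (which requires an even number of replicates), the kernel is Gaussian with a normal-reference rule-of-thumb bandwidth, and the cut-off value is hypothetical. The data are simulated, and all function names are invented for this example.

```python
import math
import random

def mean(v):
    return sum(v) / len(v)

def var(v):
    m = mean(v)
    return sum((a - m) ** 2 for a in v) / (len(v) - 1)

def t_type(x, y, signs_x=None, signs_y=None):
    """t-type score for one gene; passing balanced +/-1 signs gives a null score."""
    sx = signs_x or [1] * len(x)
    sy = signs_y or [1] * len(y)
    num = mean([s * a for s, a in zip(sx, x)]) - mean([s * b for s, b in zip(sy, y)])
    den = math.sqrt(var(x) / len(x) + var(y) / len(y))
    return num / den

def balanced_signs(n, rng):
    """Random arrangement of n/2 ones and n/2 minus ones (assumed null construction)."""
    signs = [1] * (n // 2) + [-1] * (n // 2)
    rng.shuffle(signs)
    return signs

def rule_of_thumb_bandwidth(sample):
    """Normal-reference bandwidth 1.06 * sd * n^(-1/5)."""
    return 1.06 * math.sqrt(var(sample)) * len(sample) ** (-0.2)

def kernel_density(sample, h=None):
    """Return a Gaussian kernel density estimate as a function of z."""
    h = h or rule_of_thumb_bandwidth(sample)
    c = 1.0 / (len(sample) * h * math.sqrt(2.0 * math.pi))
    return lambda z: c * sum(math.exp(-0.5 * ((z - s) / h) ** 2) for s in sample)

def likelihood_ratios(xs, ys, rng):
    """Ratio f0(z) / f(z) at each observed score z; small values suggest a change."""
    obs = [t_type(x, y) for x, y in zip(xs, ys)]
    null = [t_type(x, y, balanced_signs(len(x), rng), balanced_signs(len(y), rng))
            for x, y in zip(xs, ys)]
    f, f0 = kernel_density(obs), kernel_density(null)
    return [f0(z) / f(z) for z in obs]

# Simulated toy data: 200 genes, 4 replicates per condition,
# the first 10 genes shifted by 5 units in the second condition.
rng = random.Random(0)
n_genes, n_changed, n_rep = 200, 10, 4
xs = [[rng.gauss(0, 1) for _ in range(n_rep)] for _ in range(n_genes)]
ys = [[rng.gauss(5 if g < n_changed else 0, 1) for _ in range(n_rep)]
      for g in range(n_genes)]

lr = likelihood_ratios(xs, ys, rng)
cutoff = 0.05  # hypothetical threshold, not the cut-off derived in the chapter
flagged = [g for g, r in enumerate(lr) if r < cutoff]
print(flagged)
```

Genes whose ratio falls below the cut-off are flagged as candidates for altered expression. In practice the cut-off would be chosen by a principled rule such as the one the chapter derives, rather than fixed by hand as it is here.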
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Gannoun, A., Liquet, B., Saracco, J., Urfer, W. (2007). A Kernel Method Used for the Analysis of Replicated Micro-array Experiments. In: Statistical Methods for Biostatistics and Related Fields. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-32691-5_3
DOI: https://doi.org/10.1007/978-3-540-32691-5_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-32690-8
Online ISBN: 978-3-540-32691-5
eBook Packages: Mathematics and Statistics (R0)