A Kernel Method Used for the Analysis of Replicated Micro-array Experiments

  • Ali Gannoun
  • Benoît Liquet
  • Jérôme Saracco
  • Wolfgang Urfer


Microarrays belong to a new class of biotechnologies that allow the expression levels of thousands of genes to be monitored simultaneously. In microarray data analysis, comparing gene expression profiles across conditions and selecting biologically interesting genes are crucial tasks. Multivariate statistical methods have been applied to analyze these large data sets. In this chapter we describe a new nonparametric statistical approach for identifying genes whose expression is altered between two experimental conditions. Specifically, we propose estimating the distributions of a t-type statistic and of its null statistic using kernel methods. Comparing these two distributions by means of a likelihood ratio test identifies genes with significantly changed expression. A method for calculating the cut-off point and the acceptance region is also derived. The methodology is applied to a leukemia data set containing expression levels of 7129 genes, and the results are compared with those of the traditional t-test and the normal mixture model.
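
The pipeline outlined in the abstract (a t-type statistic per gene, a null statistic, kernel estimates of both distributions, and a likelihood ratio) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the chapter's implementation: the null statistic is built by random sign flips within each condition (a construction used in the mixture-model literature, assumed here and requiring an even number of replicates), the kernel is Gaussian with a normal-reference bandwidth, the data are simulated, and `CUTOFF` is a hypothetical threshold rather than the derived cut-off point. All function names are invented for the example.

```python
import math
import random
import statistics


def t_type_statistic(x, y):
    """Two-sample t-type statistic (unequal variances) for one gene."""
    se = math.sqrt(statistics.variance(x) / len(x) + statistics.variance(y) / len(y))
    return (statistics.mean(x) - statistics.mean(y)) / se


def null_statistic(x, y, rng):
    """Assumed null counterpart: random +/-1 signs (half each) within a
    condition cancel that condition's mean, leaving only noise."""
    def signed_mean(values):
        half = len(values) // 2
        signs = [1] * half + [-1] * (len(values) - half)
        rng.shuffle(signs)
        return sum(s * v for s, v in zip(signs, values)) / len(values)

    se = math.sqrt(statistics.variance(x) / len(x) + statistics.variance(y) / len(y))
    return (signed_mean(x) - signed_mean(y)) / se


def gaussian_kde(sample, h=None):
    """Return a Gaussian kernel density estimator built from `sample`."""
    n = len(sample)
    if h is None:
        # Normal-reference rule of thumb; an illustrative bandwidth choice.
        h = 1.06 * statistics.stdev(sample) * n ** (-0.2)
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))

    def density(t):
        return norm * sum(math.exp(-0.5 * ((t - s) / h) ** 2) for s in sample)

    return density


def likelihood_ratios(cond1, cond2, rng):
    """Return (Z, LR) with LR_i = f0_hat(Z_i) / f_hat(Z_i) for every gene i."""
    Z = [t_type_statistic(x, y) for x, y in zip(cond1, cond2)]
    z = [null_statistic(x, y, rng) for x, y in zip(cond1, cond2)]
    f_hat, f0_hat = gaussian_kde(Z), gaussian_kde(z)
    return Z, [f0_hat(t) / f_hat(t) for t in Z]


def simulate(n_genes=300, n_changed=30, n_rep=6, shift=3.0, seed=0):
    """Simulated log-expression data: the first `n_changed` genes are shifted."""
    rng = random.Random(seed)
    cond1 = [[rng.gauss(shift if g < n_changed else 0.0, 1.0) for _ in range(n_rep)]
             for g in range(n_genes)]
    cond2 = [[rng.gauss(0.0, 1.0) for _ in range(n_rep)] for _ in range(n_genes)]
    return cond1, cond2, rng


cond1, cond2, rng = simulate()
Z, lr = likelihood_ratios(cond1, cond2, rng)
CUTOFF = 0.2  # hypothetical threshold, not the chapter's derived cut-off
flagged = [g for g, r in enumerate(lr) if r < CUTOFF]
print(f"{len(flagged)} genes flagged; {sum(g < 30 for g in flagged)} of them truly changed")
```

A small likelihood ratio means the observed statistic is much less plausible under the estimated null density than under the estimated overall density, which is why genes are flagged when the ratio falls below the threshold. In practice the threshold would come from a calibrated cut-off such as the one the chapter derives, not from a fixed constant.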


Keywords: Microarray Experiment; Kernel Method; Kernel Estimator; Normal Mixture; Null Statistic
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.




Copyright information

© Springer-Verlag Berlin Heidelberg 2007
