Gene Selection with the δ-Sequence Method

Protocol
Part of the Methods in Molecular Biology book series (MIMB, volume 972)

Abstract

In this chapter, we discuss a method of selecting differentially expressed genes based on a newly discovered structure termed the δ-sequence. Combined with the nonparametric empirical Bayes methodology, it leads to dramatic gains in the mean numbers of true and false discoveries and in the stability of the testing results. Furthermore, its outcomes are entirely free from log-additive array-specific technical noise. The new paradigm offers considerable scope for future developments in this area of methodological research.
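
The claim about log-additive array-specific noise can be illustrated with a small sketch. It assumes, as the key word "Gene pairs" suggests but the abstract does not spell out, that the method works with differences of log-expression levels within gene pairs. The function names, the pairing, and the simulated data are hypothetical and are not the authors' implementation. Under that assumption, an offset added to every gene on the same array cancels in such differences:

```python
import random


def add_array_noise(log_expr, array_offsets):
    """Add one log-additive offset per array (column) to every gene (row)."""
    return [[value + array_offsets[j] for j, value in enumerate(row)]
            for row in log_expr]


def pair_differences(log_expr, pairs):
    """Per-array differences of log-expression for each (gene_a, gene_b) pair."""
    return [[a - b for a, b in zip(log_expr[i], log_expr[k])]
            for i, k in pairs]


random.seed(0)
n_genes, n_arrays = 6, 4

# Simulated "clean" log-expression matrix: rows are genes, columns are arrays.
clean = [[random.gauss(8.0, 1.0) for _ in range(n_arrays)]
         for _ in range(n_genes)]

# Hypothetical array-specific technical noise, additive on the log scale.
offsets = [random.gauss(0.0, 0.5) for _ in range(n_arrays)]
noisy = add_array_noise(clean, offsets)

# Hypothetical non-overlapping gene pairs (an assumption for illustration).
pairs = [(0, 1), (2, 3), (4, 5)]
diff_clean = pair_differences(clean, pairs)
diff_noisy = pair_differences(noisy, pairs)

# Largest discrepancy between pair differences with and without array noise.
max_gap = max(abs(x - y)
              for row_c, row_n in zip(diff_clean, diff_noisy)
              for x, y in zip(row_c, row_n))
print(max_gap)
```

Here `max_gap` stays at the level of floating-point rounding, whereas the individual values in `noisy` are shifted by the array offsets.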

Key words

Microarray data, Correlation, Differential expression, Gene pairs

Notes

Acknowledgments

This research is supported by NIH Grant GM079259 (X. Qiu) and by the Theodosius Dobzhansky Center for Genome Bioinformatics (L. Klebanov).

Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  1. Department of Biostatistics and Computational Biology, University of Rochester, Rochester, USA
  2. Department of Probability Statistics, Charles University Prague, Prague, Czech Republic
