Statistics in Biosciences, Volume 9, Issue 1, pp 28–49

Robust Bayesian FDR Control Using Bayes Factors, with Applications to Multi-tissue eQTL Discovery

  • Xiaoquan Wen

Abstract

Motivated by the genomic application of expression quantitative trait loci (eQTL) mapping, we propose a new procedure for simultaneous testing of multiple hypotheses that uses Bayes factors as input test statistics. A key feature of the method is its robustness: it controls the targeted false discovery rate even when the parametric alternative models are misspecified. The procedure is also computationally efficient, which makes it well suited to both complex systems and large data sets in genomic applications. We discuss the theoretical properties of the new procedure and demonstrate its power and computational efficiency in applications to single-tissue and multi-tissue eQTL mapping.
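
As a rough illustration of what testing with Bayes factors as input statistics can look like, the sketch below applies a generic Bayesian FDR rule: each Bayes factor is converted to a posterior null probability, and the largest set of hypotheses whose average posterior null probability stays at or below the target level is rejected. This is not the procedure proposed in the article, whose details are not given in this abstract. The function name `bayesian_fdr_reject`, the assumption that the null proportion `pi0` is known, and the example values are all hypothetical.

```python
import numpy as np


def bayesian_fdr_reject(bayes_factors, pi0, alpha=0.05):
    """Generic Bayesian FDR rule (illustrative sketch, not the article's method).

    bayes_factors: Bayes factors for alternative versus null, one per test.
    pi0: assumed prior probability that a hypothesis is null (taken as known here).
    alpha: target false discovery rate level.
    """
    bf = np.asarray(bayes_factors, dtype=float)
    # Posterior probability that each hypothesis is null (local fdr).
    lfdr = pi0 / (pi0 + (1.0 - pi0) * bf)
    # Rank hypotheses from strongest to weakest evidence against the null.
    order = np.argsort(lfdr)
    # Average posterior null probability of the top-k set, for each k.
    running_mean = np.cumsum(lfdr[order]) / np.arange(1, bf.size + 1)
    # The running mean of ascending values is non-decreasing, so the number
    # of entries at or below alpha is the largest admissible k.
    k = int(np.sum(running_mean <= alpha))
    reject = np.zeros(bf.size, dtype=bool)
    reject[order[:k]] = True
    return reject, lfdr


# Hypothetical Bayes factors for five tests, with an assumed pi0 of 0.9.
reject, lfdr = bayesian_fdr_reject([1000.0, 200.0, 1.0, 0.1, 50.0], pi0=0.9)
print(reject)
```

In this sketch the rejection set depends strongly on `pi0` and on how well the alternative model behind each Bayes factor is specified, which is the sensitivity that the robustness claim in the abstract addresses.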

Keywords

False Discovery Rate; Null Model; eQTL Mapping; False Discovery Rate Control; False Discovery Rate Level
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Notes

Acknowledgments

We thank Debashis Ghosh, Matthew Stephens, and Timothee Flutre for their fruitful discussions and feedback. We are grateful for the insightful comments from the two anonymous reviewers. This work is supported by NIH Grant R01 MH101825 (PI: M. Stephens).

Copyright information

© International Chinese Statistical Association 2016

Authors and Affiliations

  1. Department of Biostatistics, University of Michigan, Ann Arbor, USA
