Bayesian Methods Applied to GWAS

  • Rohan L. Fernando
  • Dorian Garrick
Protocol
Part of the Methods in Molecular Biology book series (MIMB, volume 1019)

Abstract

Bayesian multiple-regression methods are being used successfully for genomic prediction and selection. These regression models simultaneously fit many more markers than there are observations available for the analysis. Bayes' theorem is therefore used to combine prior beliefs about marker effects, expressed as prior distributions, with information from the data for inference. Often the analyses are too complex for closed-form solutions, and Markov chain Monte Carlo (MCMC) sampling is used to draw inferences from posterior distributions. This chapter describes how these Bayesian multiple-regression analyses can be used for GWAS. In most GWAS, false positives are controlled by limiting the genome-wise error rate, which is the probability of one or more false-positive results, to a small value. Because the number of tests in GWAS is very large, this results in very low power. Here we show how, in Bayesian GWAS, false positives can be controlled by limiting the proportion of false-positive results among all positives to some small value. The advantage of this approach is that the power of detecting associations is not inversely related to the number of markers.
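
The contrast between the two kinds of error control can be illustrated with a minimal Python sketch. It is not taken from the chapter. It assumes that a posterior probability of association (here called `ppa`) has already been estimated for each marker or genomic window, for example as the fraction of MCMC samples in which that marker had a nonzero effect. The function names `select_by_pfp` and `bonferroni_threshold` and the example values are hypothetical. The first function declares as positive the largest set of top-ranked markers whose expected proportion of false positives stays at or below `alpha`; the second shows a Bonferroni-style per-test threshold, one common way of limiting the genome-wise error rate, which shrinks as the number of tests grows.

```python
import numpy as np


def select_by_pfp(ppa, alpha=0.05):
    """Indices of markers declared positive, ranked by posterior probability.

    Assumption: ppa[i] is the posterior probability that marker i is
    associated with the trait, so 1 - ppa[i] is the posterior probability
    that declaring it positive is a false positive.
    """
    ppa = np.asarray(ppa, dtype=float)
    order = np.argsort(-ppa, kind="stable")  # strongest evidence first
    # Expected proportion of false positives among the top-k declarations.
    expected_pfp = np.cumsum(1.0 - ppa[order]) / np.arange(1, ppa.size + 1)
    within_limit = np.nonzero(expected_pfp <= alpha)[0]
    k = int(within_limit[-1]) + 1 if within_limit.size else 0
    return order[:k]


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that bounds the genome-wise error rate."""
    return alpha / n_tests


if __name__ == "__main__":
    # Made-up posterior probabilities for five markers.
    example_ppa = [0.10, 0.99, 0.60, 0.97, 0.90]
    print("declared positive:", sorted(select_by_pfp(example_ppa).tolist()))
    for m in (1_000, 50_000, 1_000_000):
        print(m, "tests -> per-test level", bonferroni_threshold(0.05, m))
```

In this sketch the selection rule depends only on the posterior probabilities of the markers that are declared positive, so adding more markers with weak evidence does not by itself tighten the criterion, whereas the per-test level in `bonferroni_threshold` falls in proportion to the number of tests.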

Key words

GWAS; Bayesian multiple-regression; Genomic prediction; MCMC sampling; R-scripts

Copyright information

© Springer Science+Business Media, LLC 2013

Authors and Affiliations

  • Rohan L. Fernando (1)
  • Dorian Garrick (1)
  1. Department of Animal Science, Iowa State University, Ames, USA