Overview of Statistical Methods for Genome-Wide Association Studies (GWAS)

  • Protocol
Genome-Wide Association Studies and Genomic Prediction

Part of the book series: Methods in Molecular Biology ((MIMB,volume 1019))

Abstract

This chapter provides an overview of statistical methods for genome-wide association studies (GWAS) in animals, plants, and humans. The simplest form of GWAS, a marker-by-marker analysis, is illustrated with a simple example. The problem of selecting a significance threshold that accounts for the large number of tests performed in a GWAS is discussed. Population structure causes false positive associations in GWAS if it is not accounted for, and methods to deal with this are presented. Methods for more complex GWAS models, including haplotype-based approaches, distinguishing identity by descent from identity by state, and fitting all markers simultaneously, are described and illustrated with examples.
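
As a rough illustration of the marker-by-marker analysis and the multiple-testing problem mentioned in the abstract, the sketch below regresses a simulated phenotype on each marker in turn and applies a Bonferroni threshold. It is not taken from the chapter. The function names, the simulated data (independent markers, one causal marker at index 17), the normal approximation to the test statistic, and the choice of Bonferroni correction are all assumptions made for this example.

```python
import math
import random


def single_marker_test(genotypes, phenotypes):
    """Regress phenotype on allele count (0/1/2) for one marker.

    Returns (slope, z, p). The two-sided p-value uses a normal
    approximation to the t statistic, an assumption that is only
    reasonable for large samples.
    """
    n = len(genotypes)
    mean_x = sum(genotypes) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((x - mean_x) ** 2 for x in genotypes)
    if sxx == 0.0:  # monomorphic marker: nothing to test
        return 0.0, 0.0, 1.0
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(genotypes, phenotypes))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - intercept - slope * x) ** 2
              for x, y in zip(genotypes, phenotypes))
    se = math.sqrt(rss / (n - 2) / sxx)
    if se == 0.0:  # perfect fit: treat as maximally significant
        return slope, math.inf, 0.0
    z = slope / se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return slope, z, p


def simulate(n_individuals=500, n_markers=200, causal=17, effect=1.0, seed=1):
    """Simulate independent markers (no linkage disequilibrium, a
    simplification) and a phenotype driven by one causal marker."""
    rng = random.Random(seed)
    freqs = [rng.uniform(0.1, 0.9) for _ in range(n_markers)]
    # genotypes[j][i] is the allele count of individual i at marker j
    genotypes = [
        [(rng.random() < f) + (rng.random() < f) for _ in range(n_individuals)]
        for f in freqs
    ]
    phenotypes = [
        effect * genotypes[causal][i] + rng.gauss(0.0, 1.0)
        for i in range(n_individuals)
    ]
    return genotypes, phenotypes


def bonferroni_scan(genotypes, phenotypes, alpha=0.05):
    """Test every marker separately and flag those below alpha / n_tests."""
    threshold = alpha / len(genotypes)
    p_values = [single_marker_test(g, phenotypes)[2] for g in genotypes]
    hits = [j for j, p in enumerate(p_values) if p < threshold]
    return p_values, hits, threshold


genotypes, phenotypes = simulate()
p_values, hits, threshold = bonferroni_scan(genotypes, phenotypes)
print(f"Bonferroni threshold: {threshold:.2e}")
print(f"Markers passing threshold: {hits}")
```

Bonferroni correction treats the tests as independent, so with real markers in linkage disequilibrium it tends to be conservative. This sketch also ignores population structure, which the abstract notes must be accounted for to avoid false positive associations.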

Copyright information

© 2013 Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Hayes, B. (2013). Overview of Statistical Methods for Genome-Wide Association Studies (GWAS). In: Gondro, C., van der Werf, J., Hayes, B. (eds) Genome-Wide Association Studies and Genomic Prediction. Methods in Molecular Biology, vol 1019. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-62703-447-0_6

  • DOI: https://doi.org/10.1007/978-1-62703-447-0_6

  • Publisher Name: Humana Press, Totowa, NJ

  • Print ISBN: 978-1-62703-446-3

  • Online ISBN: 978-1-62703-447-0

  • eBook Packages: Springer Protocols