Planning and Executing a Genome Wide Association Study (GWAS)

  • Protocol
Molecular Endocrinology

Part of the book series: Methods in Molecular Biology (MIMB, volume 590)

Abstract

In recent years, genome-wide association approaches have proven a powerful and successful strategy for identifying genetic contributors to complex traits, including a number of endocrine disorders. As a result, genome-wide association studies (GWAS) are fast becoming the default study design for discovering new genetic variants that influence a clinical trait or phenotype. This chapter focuses on key elements that require consideration for the successful conduct of a GWAS. Although many of the considerations are common to any genetic study, the greater cost, extreme multiple testing, and greater openness to data sharing require specific awareness and planning by investigators. In the section on designing a GWAS, we reflect on ethical considerations, study design, phenotype selection, power considerations, sample tracking and storage issues, and genotyping product selection. During execution, important considerations include DNA quantity and preparation, genotyping methods, quality control checks of genotype data, in silico genotyping (imputation), tests of association, and replication of association signals. Although the field of human genetics is rapidly evolving, recent experience can help guide an investigator in making the practical and methodological choices that will ultimately determine the overall quality of GWAS results. Given the investment needed to recruit patient populations or cohorts that are adequately powered for a GWAS, and the still substantial costs of genotyping, awareness of these aspects helps maximize the likelihood of success, especially where they can be implemented prospectively.
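
The "extreme multiple testing" and "tests of association" mentioned in the abstract can be made concrete with a rough sketch. The code below is not taken from the chapter. The function names, the allele counts, and the figure of one million tests are illustrative assumptions, and Bonferroni correction is only one common, conservative way to set a genome-wide threshold.

```python
import math


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level that keeps the family-wise error near alpha."""
    return alpha / n_tests


def allelic_chi2(case_a: int, case_b: int, ctrl_a: int, ctrl_b: int):
    """Pearson chi-square test (1 df) on a 2x2 table of allele counts.

    Rows are cases and controls; columns are allele A and allele B.
    Returns (chi-square statistic, p-value).
    """
    n = case_a + case_b + ctrl_a + ctrl_b
    row_case, row_ctrl = case_a + case_b, ctrl_a + ctrl_b
    col_a, col_b = case_a + ctrl_a, case_b + ctrl_b
    if 0 in (row_case, row_ctrl, col_a, col_b):
        raise ValueError("every row and column of the table must be non-empty")
    chi2 = n * (case_a * ctrl_b - case_b * ctrl_a) ** 2 / (
        row_case * row_ctrl * col_a * col_b
    )
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical numbers: 1,000 cases and 1,000 controls (2,000 alleles each),
# tested at an assumed one million independent markers.
threshold = bonferroni_threshold(0.05, 1_000_000)
chi2, p = allelic_chi2(600, 1400, 500, 1500)
print(f"threshold={threshold:.1e}  chi2={chi2:.2f}  p={p:.2e}  "
      f"genome-wide significant: {p < threshold}")
```

With these made-up counts the single-marker p-value would pass a conventional 0.05 cutoff but falls well short of the corrected threshold, which illustrates why power and sample size need attention at the design stage.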

Acknowledgments

We wish to thank Andrew Singleton, Ph.D., National Institute on Aging, Kathleen H. Day, University of Virginia and Fang-Chi Hsu, Ph.D., Wake Forest University School of Medicine, for helpful discussion and comments.

Copyright information

© 2009 Humana Press, a part of Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Sale, M.M., Mychaleckyj, J.C., Chen, WM. (2009). Planning and Executing a Genome Wide Association Study (GWAS). In: Park-Sarge, OK., Curry, T. (eds) Molecular Endocrinology. Methods in Molecular Biology, vol 590. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-60327-378-7_25

  • DOI: https://doi.org/10.1007/978-1-60327-378-7_25

  • Publisher Name: Humana Press, Totowa, NJ

  • Print ISBN: 978-1-60327-377-0

  • Online ISBN: 978-1-60327-378-7

  • eBook Packages: Springer Protocols